Building your own LLM can be a transformative step in research, product innovation, or domain-specific automation. Depending on your goals, computational resources, and available data, there are two primary paths you can take:
Choose Your Track First
Goal | Track | Description |
---|---|---|
Full control, academic/research | Build from Scratch | You define the architecture and tokenizer, and train from raw text |
Quick results, domain-specific | Fine-tune Pretrained LLM | Use existing models like LLaMA, Mistral, GPT-Neo |
Full Roadmap to Build Your Own LLM
1. Define Objectives
- What do you want your LLM to do? E.g., Chatbot, Q&A system, coding assistant, legal summarizer
2. Collect and Prepare Data
- Data sources: Wikipedia, books, Common Crawl, academic papers, code, chat logs
- Cleaning: remove boilerplate, HTML tags, duplicates
- Tokenization-ready corpus (plain .txt or .jsonl)
Tools: datasets, BeautifulSoup, langchain, pdfminer, Apache Tika
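For example, a minimal cleaning pass with BeautifulSoup might look like the sketch below; the raw_pages/ folder and corpus.txt output are placeholder names, not fixed conventions.

```python
import glob
from bs4 import BeautifulSoup

seen = set()
with open("corpus.txt", "w", encoding="utf-8") as out:
    for path in glob.glob("raw_pages/*.html"):
        with open(path, "r", encoding="utf-8") as f:
            # get_text() strips tags and keeps only the visible text
            text = BeautifulSoup(f.read(), "html.parser").get_text(separator="\n")
        for line in text.splitlines():
            line = line.strip()
            if line and line not in seen:   # drop empty lines and exact duplicates
                seen.add(line)
                out.write(line + "\n")
```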
3. Tokenization
- Choose subword technique:
- Byte-Level BPE (GPT-2, GPT-J)
- WordPiece (BERT)
- Unigram LM (T5, XLNet)
Tools: SentencePiece, Hugging Face Tokenizers
Tip: If you're starting from scratch, train your tokenizer on your own dataset, for example:
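A minimal sketch with the Hugging Face tokenizers library, assuming your cleaned corpus sits in corpus.txt and you want roughly a 32k vocabulary (both are placeholders you should tune):

```python
import os
from tokenizers import ByteLevelBPETokenizer

tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.txt"],            # your cleaned corpus
    vocab_size=32000,
    min_frequency=2,
    special_tokens=["<|endoftext|>"],
)

os.makedirs("tokenizer", exist_ok=True)
tokenizer.save_model("tokenizer")    # writes vocab.json + merges.txt
print(tokenizer.encode("Hello, world!").tokens)
```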
4. Design or Choose Model Architecture
- If from scratch:
- Build transformer blocks (multi-head attention + feedforward)
- Decide: depth, width, heads, position embeddings
- If fine-tuning:
- Choose from: LLaMA 2, Mistral, GPT-Neo, BLOOM, etc.
Frameworks: PyTorch, TensorFlow, Hugging Face Transformers, nanoGPT, minGPT, Megatron-LM
5. Train or Fine-Tune
- From scratch: use massive datasets (100GB+), train on 8+ GPUs or TPUs
- Fine-tune: smaller datasets (100 MB–10 GB), use parameter-efficient techniques:
- LoRA, QLoRA, PEFT, Adapters
Tools: transformers.Trainer, accelerate, DeepSpeed, Ray, ColossalAI (see the LoRA sketch below)
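As a rough illustration of the parameter-efficient route, here is a hedged LoRA sketch using the peft library. The GPT-2 base model is only a stand-in for whichever pretrained LLM you choose, and LoRA hyperparameters (rank, target modules) vary by architecture:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in base model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                # rank of the low-rank adapter matrices
    lora_alpha=32,
    lora_dropout=0.05,
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()   # only the small adapter matrices are trainable
# ...then fine-tune with transformers.Trainer on your domain dataset as usual.
```

Because only the adapters are updated, this approach fits modest models on a single GPU while leaving the base weights frozen.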
6. Evaluate the Model
- Tasks: Text generation, question answering, summarization
- Metrics:
- Perplexity
- BLEU, ROUGE, Exact Match
- Human evaluation
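Perplexity, listed above, is just the exponential of the average cross-entropy loss; a minimal sketch with toy tensors (shapes and vocabulary size are illustrative only):

```python
import tensorflow as tf

# Toy shapes: batch of 2 sequences of length 4 over a 10-token vocabulary
labels = tf.random.uniform((2, 4), maxval=10, dtype=tf.int32)
logits = tf.random.normal((2, 4, 10))

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
mean_loss = loss_fn(labels, logits)   # average cross-entropy (nats)
perplexity = tf.exp(mean_loss)        # lower is better; random guessing is ~vocab size
print(float(perplexity))
```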
7. Save and Export
- Save the model weights, tokenizer, and config, e.g. with `model.save_pretrained()` (see the sketch below)
- Convert to ONNX or TorchScript if needed
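A small saving sketch in the Hugging Face style; the gpt2 model here is only a stand-in for your own trained model, and my_llm is a placeholder directory:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")   # stand-in for your trained model
tokenizer = AutoTokenizer.from_pretrained("gpt2")

model.save_pretrained("my_llm")        # writes config.json + model weights
tokenizer.save_pretrained("my_llm")    # writes vocab/merges/tokenizer config

# Reload later with:
#   AutoModelForCausalLM.from_pretrained("my_llm")
#   AutoTokenizer.from_pretrained("my_llm")
```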
8. Deploy
- Build an API with Flask, FastAPI, or Gradio (see the FastAPI sketch after this list)
- Use Streamlit or LangChain for interface
- Host on:
- Cloud (AWS, GCP, Azure)
- Local GPU server
- Hugging Face Spaces
- Docker container
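A minimal FastAPI sketch for serving the saved model; the my_llm directory, endpoint name, and request schema are assumptions, not fixed conventions:

```python
from fastapi import FastAPI
from pydantic import BaseModel
from transformers import pipeline

app = FastAPI()
generator = pipeline("text-generation", model="my_llm")   # path to your saved model

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 100

@app.post("/generate")
def generate(prompt: Prompt):
    out = generator(prompt.text, max_new_tokens=prompt.max_new_tokens)
    return {"completion": out[0]["generated_text"]}

# Run with: uvicorn app:app --host 0.0.0.0 --port 8000
# Wrap this in a Docker image for any of the hosting options above.
```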
9. Post-Deployment Monitoring
- Track hallucinations, drift, user feedback
- Use prompt engineering or RAG (retrieval-augmented generation) to improve relevance
Summary Table
Step | Description | Tools |
---|---|---|
1. Define | Task + scope | –
2. Collect Data | Clean, deduplicate | datasets, bs4, pandas |
3. Tokenize | Subword or byte-based | tokenizers, SentencePiece |
4. Model | Build or load | transformers, PyTorch, nanoGPT |
5. Train | Scratch or fine-tune | Trainer, accelerate, DeepSpeed |
6. Evaluate | Perplexity, BLEU | evaluate, custom scripts |
7. Save | Store weights, tokenizer | model.save_pretrained() |
8. Deploy | API or frontend | Streamlit, FastAPI, Docker |
9. Monitor | Feedback + improvements | LangChain, telemetry |
NLP Pipeline Used to Build a Model
NLP Pipeline Enriched Table
Step | Type of Process | Name / Technique | Where It's Used | Year Developed |
---|---|---|---|---|
1 | Raw Input | Raw Text | All NLP tasks | – |
2 | Text Cleaning | Lowercasing, Stopword Removal, Lemmatization | Preprocessing, traditional NLP | 1990s–2000s |
3 | Tokenization | Word-Level | NLTK, SpaCy, classical ML | ~2000 |
3 | Tokenization | Subword BPE | GPT-2, RoBERTa | 2015 (Sennrich) |
3 | Tokenization | WordPiece | BERT, ALBERT | 2016 (Google) |
3 | Tokenization | Byte-Level BPE | GPT-2, GPT-Neo | 2019 (OpenAI) |
3 | Tokenization | Unigram LM (SentencePiece) | T5, XLNet, multilingual NLP | 2018 (Google) |
4 | Vectorization | One-Hot Encoding | Classical ML, small DL models | ~1980s–1990s |
4 | Vectorization | TF-IDF | Text classification, IR, ML models | 1972 (Spärck Jones) |
4 | Vectorization | Count Vectorizer | Naive Bayes, SVMs | ~1990s |
4 | Vectorization | Word2Vec | Static embeddings | 2013 (Google) |
4 | Vectorization | GloVe | Static embeddings | 2014 (Stanford) |
4 | Vectorization | FastText | Static + subword embeddings | 2016 (Facebook) |
4 | Vectorization | Transformer Embeddings | GPT, BERT, LLaMA | 2017 (Google) |
5 | Feature Selection | Chi-Squared, PCA, SelectKBest | Classical ML pipelines | 1990s–2000s |
5 | Feature Selection | Attention-based selection | Neural LLMs (implicitly) | 2017+ |
6 | Modeling | Logistic Regression, SVM | Traditional ML | ~1950s–1990s |
6 | Modeling | LSTM / GRU | RNN-based NLP | 2014–2015 |
6 | Modeling | Transformer (Self-Attention) | BERT, GPT, T5, LLaMA | 2017 (Vaswani) |
7 | Evaluation | Accuracy, F1, BLEU, Perplexity | Model assessment | Ongoing |
Notes:
- TF-IDF is one of the oldest vectorization techniques and was foundational to early information retrieval systems.
- Word2Vec, GloVe, and FastText introduced semantic similarity to embeddings.
- BPE, WordPiece, and SentencePiece are critical to subword-based tokenization used in most LLMs today.
- Transformer-based embeddings (like those in BERT, GPT) revolutionized NLP starting in 2017.
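To make the classical vectorization step concrete, here is a tiny TF-IDF sketch with scikit-learn; the three-sentence corpus is purely illustrative:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)            # sparse matrix: (3 documents, vocab features)
print(X.shape)
print(vectorizer.get_feature_names_out()[:5])   # a few of the learned vocabulary terms
```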
Types of Tokenization Techniques
Here's a categorized overview of the most popular tokenization techniques, from basic word-level splitting to advanced subword models like Byte-Level BPE.
Whitespace/Word Tokenization
- Method: Splits on spaces and punctuation.
- Example: "Hello, world!" → ["Hello", ",", "world", "!"]
- Pros: Simple, fast.
- Cons: Poor handling of unknown/misspelled words.
Used in: Traditional NLP (before the deep learning era)
Character-Level Tokenization
- Method: Breaks text into individual characters.
- Example: "Chat" → ["C", "h", "a", "t"]
- Pros: Handles unknown words perfectly.
- Cons: Long sequences, weak semantics.
Used in: Very small or character-sensitive models (e.g., some speech models)
Subword Tokenization (Most Common in LLMs)
a. Byte-Pair Encoding (BPE)
- Method: Starts with characters and merges frequent pairs.
- Example: "lower", "lowest" → ["low", "er"], ["low", "est"]
- Pros: Handles rare words better than word-level.
- Cons: Doesn't consider context.
Used in: GPT-2, GPT-3, RoBERTa
b. Byte-Level BPE
- Method: BPE applied at byte level (i.e., raw UTF-8), not characters.
- Example: "hello" → ["h", "e", "l", "l", "o"] → merged into tokens like "he", "llo"
- Pros: Handles all languages & symbols without pre-tokenization.
- Cons: Can produce more tokens per input (especially for non-ASCII text) than character-based BPE.
Used in: GPT-2, GPT-Neo, RoBERTa
c. WordPiece
- Method: Similar to BPE but uses a greedy likelihood-based merging strategy.
- Example: "unaffordable" → ["un", "##afford", "##able"]
- Pros: Better modeling of morphemes.
- Cons: Vocabulary often English-biased.
Used in: BERT, ALBERT, DistilBERT
d. Unigram Language Model (ULM)
- Method: Chooses subwords based on a probabilistic model, not just frequency.
- Example: Picks most probable tokenization among many options.
- Pros: More flexible; allows multiple ways to tokenize.
- Cons: Slightly more complex.
Used in: T5, XLNet, SentencePiece tokenizer
Byte-Level Unicode Tokenization
- Method: Tokenizes input at byte level using Unicode bytes.
- Pros: Universal for all languages, emojis, code, etc.
- Cons: Long token sequences.
Used in: GPT-J, BigScience BLOOM, newer models with multi-language support.
Character + Subword Hybrid
- Mixes character and subword tokens to balance robustness and sequence length.
Used in: Some experimental multilingual or speech models.
Quick Comparison
Technique | Handles OOV | Language-Agnostic | Compression | Used In |
---|---|---|---|---|
Word/Whitespace | No | No | No | NLTK, SpaCy (basic) |
Char-Level | Yes | Yes | No | Speech, OCR |
BPE | Yes | Partially (not always) | Yes | GPT-2, RoBERTa |
Byte-Level BPE | Yes | Yes | Yes | GPT-2, GPT-J |
WordPiece | Yes | Partially | Yes | BERT, ALBERT |
Unigram LM | Yes | Yes | Yes | T5, XLNet |
Byte-Level BPE is widely used in GPT-family models because it's compact, Unicode-friendly, and doesn't require pre-tokenization; WordPiece and Unigram LM remain common in BERT, T5, and similar models. For example:
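You can see byte-level BPE in action with the pretrained GPT-2 tokenizer from transformers; the sample sentence is arbitrary:

```python
from transformers import GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
print(tok.tokenize("Hello, world!"))
# A leading space becomes part of the next token, shown with the 'Ġ' marker,
# and any non-ASCII input would be represented as byte-level pieces.
```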
Embeddings: How Are They Different from Plain Vectors?
Embeddings are dense vector representations of words, subwords, or tokens, where:
- Similar meanings → similar vectors (semantic proximity); see the toy example after this list
- Fixed size (e.g., 300-dim or 768-dim) regardless of vocab size
- Can be pretrained or learned during training
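A toy illustration of semantic proximity using invented 3-dimensional vectors (real embeddings are learned and have hundreds of dimensions):

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity: 1.0 means same direction, 0.0 means unrelated
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

king  = np.array([0.9, 0.8, 0.1])    # invented numbers, illustration only
queen = np.array([0.85, 0.75, 0.2])
apple = np.array([0.1, 0.2, 0.9])

print(cosine(king, queen))   # high: similar meaning, similar vector
print(cosine(king, apple))   # much lower: dissimilar meaning, dissimilar vector
```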
Types of Embeddings
Type | Description | Used In |
---|---|---|
Static Word Embeddings | One fixed vector per word | Word2Vec, GloVe, FastText |
Contextual Embeddings | Varies by sentence context | BERT, GPT, LLaMA, T5 |
Learned Embeddings | Initialized randomly & trained with model | Most neural models |
One-Hot Encoding vs Embeddings
Feature | One-Hot | Embeddings |
---|---|---|
Vector type | Sparse (mostly zeros) | Dense |
Semantic similarity | No | Yes (similar words → similar vectors) |
Size | Vocab-size dimensional | Usually 100–1000 dimensions |
Scalable to large vocab | Poor | Excellent |
LLMs | Not used | Core input layer |
Embedding in Deep Learning
When you build a model in TensorFlow or PyTorch, you'll typically include an embedding layer:

```python
tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim)
```

This maps token IDs to dense vectors before passing them into transformers, RNNs, etc.
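A minimal, runnable demo of that layer with toy sizes:

```python
import tensorflow as tf

vocab_size, embedding_dim = 10000, 256            # toy sizes for illustration
embedding = tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=embedding_dim)

token_ids = tf.constant([[5, 42, 7]])             # one sequence of 3 token IDs
vectors = embedding(token_ids)
print(vectors.shape)                              # (1, 3, 256): a 256-d vector per token
```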
Where Embedding Falls in the Pipeline
Step | Process | Embedding Falls Here? |
---|---|---|
1 | Raw Text | No
2 | Text Cleaning | No
3 | Tokenization | Prepares token IDs
4 | Vectorization | Yes, embeddings are used here
5 | Feature Selection | Rarely; embeddings are learned end-to-end
6 | Modeling | Input to neural models
7 | Evaluation | No
Below are code snippets to build a basic Large Language Model (LLM) from scratch using Keras (TensorFlow). This example walks you through training a mini language model on your own text data, starting with a simple recurrent baseline and then a GPT-style Transformer.
This is educational-level code, not production-scale. A full LLM requires huge datasets and distributed training.
Step-by-Step: Build a Mini LLM with Keras
Step 1: Install Required Libraries

```bash
pip install tensorflow numpy
```

Step 2: Prepare the Dataset

For simplicity, we'll use a plain .txt file as training data. You can replace this with a larger corpus.
```python
import tensorflow as tf
import numpy as np

# Load text data
with open("your_dataset.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Create character-level vocabulary
vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

# Convert to int
text_as_int = np.array([char2idx[c] for c in text])

# Define sequence length
seq_length = 100
examples_per_epoch = len(text) // seq_length

# Create training dataset
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)

# Batch size
BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
```
Step 3: Define the Model (Embedding + GRU)
```python
from tensorflow.keras import layers

# Model config
vocab_size = len(vocab)
embedding_dim = 256
rnn_units = 1024

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        layers.Embedding(vocab_size, embedding_dim,
                         batch_input_shape=[batch_size, None]),
        layers.GRU(rnn_units,
                   return_sequences=True,
                   stateful=True,
                   recurrent_initializer='glorot_uniform'),
        layers.Dense(vocab_size)
    ])
    return model

model = build_model(vocab_size, embedding_dim, rnn_units, BATCH_SIZE)
model.summary()
```
Step 4: Compile and Train
```python
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

model.compile(optimizer='adam', loss=loss)

# Checkpoint callback
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath="./checkpoints/ckpt_{epoch}",
    save_weights_only=True
)

EPOCHS = 10
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])
```
Step 5: Generate Text
To generate text, we load the trained weights and use a loop to predict one character at a time.
```python
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint('./checkpoints'))
model.build(tf.TensorShape([1, None]))

def generate_text(model, start_string, num_generate=500, temperature=1.0):
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    text_generated = []
    model.reset_states()
    for _ in range(num_generate):
        predictions = model(input_eval)
        predictions = predictions[:, -1, :] / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return start_string + ''.join(text_generated)

print(generate_text(model, start_string="Once upon a time, "))
```
Final Thoughts
What We Just Built | Notes |
---|---|
Mini LLM using GRU + Embedding | Can be replaced by full Transformer blocks |
Character-level generation | Can be upgraded to word/subword level with Tokenizer |
Trained on local text | For better results, use larger cleaned datasets |
Upgrade Paths
- Use Transformer blocks (MultiHeadAttention, LayerNorm).
- Train on tokenized text using BPE (e.g., sentencepiece; a training sketch follows this list).
- Train with TPUs or multi-GPU setups.
- Switch to huggingface/transformers for scalable workflows.
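For the BPE upgrade path, a SentencePiece training sketch; the file names and vocabulary size are placeholders:

```python
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="your_dataset.txt",   # same corpus as above
    model_prefix="spm_bpe",     # writes spm_bpe.model and spm_bpe.vocab
    vocab_size=8000,
    model_type="bpe",
)

sp = spm.SentencePieceProcessor(model_file="spm_bpe.model")
print(sp.encode("Once upon a time", out_type=str))
```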
Here's a rewritten version of the mini LLM using full Transformer layers in Keras, inspired by a GPT-like architecture (decoder-only Transformer with causal masking).
Build a GPT-Style Transformer LLM from Scratch Using Keras
This implementation uses:
- Positional Embeddings
- Multi-Head Self-Attention
- Causal Masking
- Feedforward Layers
- Layer Normalization & Residual Connections
Step-by-Step Guide

Step 1: Install and Import Libraries

```bash
pip install tensorflow numpy
```
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
```
Step 2: Prepare Dataset (Character-Level)
```python
# Load your dataset
with open("your_dataset.txt", "r", encoding="utf-8") as f:
    text = f.read()

# Build vocabulary
vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

# Encode text
text_as_int = np.array([char2idx[c] for c in text])

# Create input-target pairs
seq_length = 128
examples_per_epoch = len(text_as_int) // seq_length

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text

dataset = sequences.map(split_input_target)

BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
```
Step 3: Define Transformer Components
```python
class PositionalEmbedding(layers.Layer):
    def __init__(self, vocab_size, d_model, max_len):
        super().__init__()
        self.token_emb = layers.Embedding(input_dim=vocab_size, output_dim=d_model)
        self.pos_emb = layers.Embedding(input_dim=max_len, output_dim=d_model)

    def call(self, x):
        maxlen = tf.shape(x)[-1]
        positions = tf.range(start=0, limit=maxlen, delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions


class CausalSelfAttention(layers.Layer):
    def __init__(self, d_model, num_heads):
        super().__init__()
        self.mha = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
        self.layernorm = layers.LayerNormalization(epsilon=1e-6)
        self.dropout = layers.Dropout(0.1)

    def call(self, x, training):
        attn_output = self.mha(query=x, value=x, key=x,
                               attention_mask=self._causal_mask(tf.shape(x)[1]))
        attn_output = self.dropout(attn_output, training=training)
        return self.layernorm(x + attn_output)

    def _causal_mask(self, size):
        # Lower-triangular mask: each position attends only to itself and earlier positions
        i = tf.range(size)[:, None]
        j = tf.range(size)
        mask = tf.cast(i >= j, dtype=tf.int32)
        return mask[None, None, :, :]


class FeedForward(layers.Layer):
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.seq = keras.Sequential([
            layers.Dense(d_ff, activation='relu'),
            layers.Dense(d_model),
            layers.Dropout(0.1)
        ])
        self.layernorm = layers.LayerNormalization(epsilon=1e-6)

    def call(self, x, training):
        out = self.seq(x, training=training)
        return self.layernorm(x + out)


class TransformerBlock(layers.Layer):
    def __init__(self, d_model, num_heads, d_ff):
        super().__init__()
        self.att = CausalSelfAttention(d_model, num_heads)
        self.ff = FeedForward(d_model, d_ff)

    def call(self, x, training):
        x = self.att(x, training=training)
        x = self.ff(x, training=training)
        return x
```
Step 4: Define the GPT-like Model
```python
def build_gpt_model(vocab_size, seq_len, d_model=256, num_heads=4, d_ff=512, num_layers=4):
    inputs = layers.Input(shape=(seq_len,))
    x = PositionalEmbedding(vocab_size, d_model, seq_len)(inputs)
    for _ in range(num_layers):
        x = TransformerBlock(d_model, num_heads, d_ff)(x)
    outputs = layers.Dense(vocab_size)(x)
    return keras.Model(inputs=inputs, outputs=outputs)
```
Step 5: Compile & Train
```python
model = build_gpt_model(
    vocab_size=len(vocab),
    seq_len=seq_length,
    d_model=256,
    num_heads=4,
    d_ff=512,
    num_layers=4
)

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer="adam", loss=loss_fn)
model.summary()

EPOCHS = 10
model.fit(dataset, epochs=EPOCHS)
```
Step 6: Generate Text
```python
def generate_text(model, start_string, gen_length=500, temperature=1.0):
    # Keep the growing context as a plain Python list of token IDs
    input_indices = [char2idx[s] for s in start_string]
    generated = []
    for _ in range(gen_length):
        # Pad (or truncate) the context to the model's fixed input length
        input_padded = tf.keras.preprocessing.sequence.pad_sequences(
            [input_indices], maxlen=seq_length, padding='pre')
        predictions = model(tf.convert_to_tensor(input_padded))[:, -1, :]
        predictions = predictions / temperature
        predicted_id = int(tf.random.categorical(predictions, num_samples=1)[0, 0])
        input_indices.append(predicted_id)
        generated.append(idx2char[predicted_id])
    return start_string + ''.join(generated)

print(generate_text(model, start_string="Once upon a time, "))
```
Summary
Component | Description |
---|---|
Embedding | Token + Positional |
Transformer | Decoder-only, causal masked |
Attention | Multi-head self-attention |
Training Data | Character-based, simple .txt |
Output | Next-char prediction |
Here's a clean and reusable boilerplate for building a GPT-style Transformer LLM from scratch using Keras. It's structured into modular components and can easily be customized or extended for larger projects.
GPT-Style LLM Boilerplate (Keras / TensorFlow)
```
llm_gpt_keras/
├── data/
│   └── your_dataset.txt
├── model/
│   ├── transformer_blocks.py
│   └── gpt_model.py
├── train.py
├── generate.py
└── requirements.txt
```
requirements.txt

```
tensorflow>=2.10
numpy
```
model/transformer_blocks.py

```python
import tensorflow as tf
from tensorflow.keras import layers


class PositionalEmbedding(layers.Layer):
    def __init__(self, vocab_size, d_model, max_len):
        super().__init__()
        self.token_emb = layers.Embedding(vocab_size, d_model)
        self.pos_emb = layers.Embedding(max_len, d_model)

    def call(self, x):
        positions = tf.range(start=0, limit=tf.shape(x)[-1], delta=1)
        positions = self.pos_emb(positions)
        x = self.token_emb(x)
        return x + positions


class CausalSelfAttention(layers.Layer):
    def __init__(self, d_model, num_heads):
        super().__init__()
        self.mha = layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
        self.dropout = layers.Dropout(0.1)
        self.layernorm = layers.LayerNormalization(epsilon=1e-6)

    def call(self, x, training):
        attn = self.mha(query=x, value=x, key=x,
                        attention_mask=self._causal_mask(tf.shape(x)[1]))
        x = self.layernorm(x + self.dropout(attn, training=training))
        return x

    def _causal_mask(self, size):
        i = tf.range(size)[:, None]
        j = tf.range(size)
        mask = tf.cast(i >= j, dtype=tf.int32)
        return mask[None, None, :, :]


class FeedForward(layers.Layer):
    def __init__(self, d_model, d_ff):
        super().__init__()
        self.ff = tf.keras.Sequential([
            layers.Dense(d_ff, activation='relu'),
            layers.Dense(d_model),
            layers.Dropout(0.1)
        ])
        self.layernorm = layers.LayerNormalization(epsilon=1e-6)

    def call(self, x, training):
        return self.layernorm(x + self.ff(x, training=training))


class TransformerBlock(layers.Layer):
    def __init__(self, d_model, num_heads, d_ff):
        super().__init__()
        self.att = CausalSelfAttention(d_model, num_heads)
        self.ff = FeedForward(d_model, d_ff)

    def call(self, x, training):
        x = self.att(x, training)
        x = self.ff(x, training)
        return x
```
model/gpt_model.py

```python
from tensorflow.keras import layers, Model, Input

from model.transformer_blocks import PositionalEmbedding, TransformerBlock


def build_gpt_model(vocab_size, seq_len, d_model=256, num_heads=4, d_ff=512, num_layers=4):
    inputs = Input(shape=(seq_len,))
    x = PositionalEmbedding(vocab_size, d_model, seq_len)(inputs)
    for _ in range(num_layers):
        x = TransformerBlock(d_model, num_heads, d_ff)(x)
    outputs = layers.Dense(vocab_size)(x)
    return Model(inputs, outputs)
```
train.py

```python
import os

import tensorflow as tf
import numpy as np

from model.gpt_model import build_gpt_model

# Load data
with open("data/your_dataset.txt", "r", encoding="utf-8") as f:
    text = f.read()

vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)
text_as_int = np.array([char2idx[c] for c in text])

seq_length = 128
examples_per_epoch = len(text_as_int) // (seq_length + 1)

char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

def split_input_target(chunk):
    return chunk[:-1], chunk[1:]

dataset = sequences.map(split_input_target).shuffle(10000).batch(64, drop_remainder=True)

# Model setup
model = build_gpt_model(
    vocab_size=len(vocab),
    seq_len=seq_length,
    d_model=256,
    num_heads=4,
    d_ff=512,
    num_layers=4
)
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

model.fit(dataset, epochs=10)

# Make sure the checkpoint directory exists before saving
os.makedirs("checkpoints", exist_ok=True)
model.save_weights("checkpoints/gpt_small.h5")
```
generate.py

```python
import numpy as np
import tensorflow as tf

from model.gpt_model import build_gpt_model

# Rebuild the vocabulary from the same dataset used for training
with open("data/your_dataset.txt", "r", encoding="utf-8") as f:
    text = f.read()

vocab = sorted(set(text))
char2idx = {u: i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

seq_length = 128
model = build_gpt_model(len(vocab), seq_length)
model.load_weights("checkpoints/gpt_small.h5")

def generate_text(start_string, num_generate=500, temperature=1.0):
    # Keep the growing context as a plain Python list of token IDs
    input_indices = [char2idx[c] for c in start_string]
    generated = []
    for _ in range(num_generate):
        # Pad (or truncate) the context to the model's fixed input length
        input_padded = tf.keras.preprocessing.sequence.pad_sequences(
            [input_indices], maxlen=seq_length, padding='pre')
        predictions = model(tf.convert_to_tensor(input_padded))[:, -1, :]
        predictions = predictions / temperature
        predicted_id = int(tf.random.categorical(predictions, num_samples=1)[0, 0])
        input_indices.append(predicted_id)
        generated.append(idx2char[predicted_id])
    return start_string + ''.join(generated)

print(generate_text("Once upon a time, "))
```