# Transformers Dynamic
Use Hugging Face Transformers for NLP, computer vision, and multimodal tasks with pre-trained models. This skill covers text classification, token classification, question answering, text generation, image classification, fine-tuning with Trainer API, and efficient inference with pipelines.
## When to Use This Skill
Choose Transformers Dynamic when you need to:
- Apply pre-trained language models (BERT, GPT, T5, Llama) to NLP tasks
- Fine-tune models on custom datasets with the Trainer API
- Run inference pipelines for text classification, NER, QA, summarization, or translation
- Use vision transformers (ViT) or multimodal models (CLIP, LLaVA)
Consider alternatives when:
- You need to build custom neural architectures from scratch (use PyTorch directly)
- You need classical ML on tabular data (use scikit-learn)
- You need to call commercial LLM APIs (use the provider's SDK)
## Quick Start

```bash
pip install transformers torch datasets
```

```python
from transformers import pipeline

# Text classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love how easy this library is to use!")
print(f"Sentiment: {result[0]['label']} ({result[0]['score']:.3f})")

# Named entity recognition
ner = pipeline("ner", grouped_entities=True)
entities = ner("Hugging Face is based in New York City and was founded by Clément Delangue.")
for ent in entities:
    print(f"  {ent['word']}: {ent['entity_group']} ({ent['score']:.3f})")

# Text generation
generator = pipeline("text-generation", model="gpt2")
output = generator("The future of AI is", max_new_tokens=50, num_return_sequences=1)
print(f"Generated: {output[0]['generated_text']}")

# Question answering
qa = pipeline("question-answering")
answer = qa(
    question="What is the capital of France?",
    context="France is a country in Western Europe. Its capital is Paris.",
)
print(f"Answer: {answer['answer']} (score: {answer['score']:.3f})")
```
## Core Concepts

### Task Pipeline Reference
| Task | Pipeline Name | Default Model | Input |
|---|---|---|---|
| Sentiment analysis | "sentiment-analysis" | distilbert-base-uncased-finetuned-sst-2-english | Text |
| NER | "ner" | dbmdz/bert-large-cased-finetuned-conll03-english | Text |
| Question answering | "question-answering" | distilbert-base-cased-distilled-squad | Question + context |
| Summarization | "summarization" | facebook/bart-large-cnn | Long text |
| Translation | "translation_en_to_fr" | Helsinki-NLP/opus-mt-en-fr | Text |
| Text generation | "text-generation" | gpt2 | Prompt |
| Fill mask | "fill-mask" | bert-base-uncased | Text with [MASK] |
| Zero-shot classification | "zero-shot-classification" | facebook/bart-large-mnli | Text + labels |
| Image classification | "image-classification" | google/vit-base-patch16-224 | Image |
| Object detection | "object-detection" | facebook/detr-resnet-50 | Image |
### Fine-Tuning with Trainer API
```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)
from datasets import load_dataset
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Load dataset
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=256)

tokenized = dataset.map(tokenize_function, batched=True)

# Load model
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }

# Training config
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=100,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"].select(range(5000)),  # Subset for demo
    eval_dataset=tokenized["test"].select(range(1000)),
    compute_metrics=compute_metrics,
)

trainer.train()
results = trainer.evaluate()
print(f"Eval accuracy: {results['eval_accuracy']:.3f}")
print(f"Eval F1: {results['eval_f1']:.3f}")
```
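The `compute_metrics` hook above only depends on NumPy and scikit-learn, so it can be sanity-checked offline with made-up logits before launching a full training run:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # Trainer passes an (logits, labels) pair at evaluation time
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }

# Illustrative logits for 4 examples over 2 classes; predictions are [0, 1, 1, 0]
logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.0, 0.2]])
labels = np.array([0, 1, 1, 1])
print(compute_metrics((logits, labels)))  # accuracy 0.75
```

Testing the hook in isolation like this catches shape and label mistakes cheaply, since a full `trainer.evaluate()` pass only surfaces them after minutes of compute.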
## Configuration

| Parameter | Description | Default |
|---|---|---|
| `model_name` | Pre-trained model identifier from Hugging Face Hub | Task-dependent |
| `max_length` | Maximum token sequence length | Model-specific (512) |
| `batch_size` | Training/inference batch size | 16 |
| `learning_rate` | Fine-tuning learning rate | 5e-5 |
| `num_train_epochs` | Training epochs | 3 |
| `warmup_steps` | Learning rate warmup steps | 500 |
| `weight_decay` | L2 regularization | 0.01 |
| `fp16` | Mixed-precision training | False |
| `gradient_accumulation_steps` | Gradient accumulation for effective batch size | 1 |
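The interaction between batch size and gradient accumulation is simple arithmetic; a minimal sketch with illustrative values (the GPU count here is an assumption, not a recommendation):

```python
# Effective batch size = per-device batch * accumulation steps * number of GPUs.
# Values below are illustrative, not tuned recommendations.
per_device_train_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1  # assumed single-GPU machine

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(f"Effective batch size: {effective_batch_size}")  # Effective batch size: 32
```

This is why halving the per-device batch while doubling accumulation steps keeps training dynamics roughly unchanged at lower peak memory.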
## Best Practices

- **Use pipelines for quick inference, Trainer for fine-tuning.** Pipelines handle tokenization, batching, and post-processing automatically; use them for prototyping and production inference. Switch to the Trainer API only when you need to fine-tune a model on custom data with full control over the training loop.
- **Start with the smallest model that works.** DistilBERT is 60% faster than BERT with 97% of its accuracy. Start with distilled or small variants, and only scale up if performance is insufficient. For text generation, start with small models (GPT-2) before moving to larger ones (Llama, Mistral).
- **Set `max_length` based on your actual data.** The default (512 or 1024) wastes memory on short texts. Analyze your data's token length distribution and set `max_length` to the 95th percentile. Use `truncation=True` to handle outliers and `padding="max_length"` for consistent batch shapes.
- **Use `device_map="auto"` for large models.** Models larger than GPU memory can be automatically sharded across multiple GPUs or offloaded to CPU with `model = AutoModel.from_pretrained("model_name", device_map="auto")`. This is essential for running 7B+ parameter models on consumer hardware.
- **Enable mixed-precision training for a ~2x speedup.** Set `fp16=True` in `TrainingArguments` on NVIDIA GPUs (or `bf16=True` on Ampere or newer). This halves memory usage and nearly doubles training speed with negligible accuracy impact. Always validate that training loss remains stable after enabling it.
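The `max_length` percentile advice can be sketched as follows. The lengths here are synthetic stand-ins; in practice you would compute them from your tokenizer's `input_ids`:

```python
import numpy as np

# Synthetic token lengths standing in for real data; in practice:
#   lengths = [len(ids) for ids in tokenizer(dataset["text"])["input_ids"]]
rng = np.random.default_rng(0)
lengths = rng.lognormal(mean=4.0, sigma=0.6, size=10_000).astype(int)

p95 = int(np.percentile(lengths, 95))
# Round up to a multiple of 8, which tends to map well onto GPU kernels
max_length = ((p95 + 7) // 8) * 8
print(f"95th percentile: {p95} tokens -> max_length={max_length}")
```

Picking the 95th percentile rather than the maximum trades truncation of a few outliers for substantially smaller padded batches.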
## Common Issues

**"CUDA out of memory" during training**: Reduce `per_device_train_batch_size`, enable `gradient_accumulation_steps` to maintain the effective batch size, enable `fp16=True`, or reduce `max_length`. For inference, use `model.half()` or `load_in_8bit=True` with bitsandbytes.

**Tokenizer produces unexpected results**: Different models use different tokenizers (WordPiece, BPE, SentencePiece). Always use the tokenizer matched to your model: `AutoTokenizer.from_pretrained("model_name")`. Never mix tokenizers across models; the vocabulary indices will be misaligned.

**Fine-tuned model performs worse than the base model**: Common causes: learning rate too high (try 1e-5 to 5e-5), insufficient data (under 1000 samples; consider few-shot prompting instead), or catastrophic forgetting. Use warmup steps, lower learning rates, and evaluate after each epoch to catch overfitting early.
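As a back-of-envelope check on why `fp16` and 8-bit loading relieve memory pressure, weight memory scales linearly with bytes per parameter (the parameter count below is approximate):

```python
# Approximate weights-only memory footprint at different precisions.
# DistilBERT-base has roughly 66M parameters (approximate figure).
params = 66_000_000

def weight_gib(num_params: int, bytes_per_param: int) -> float:
    """Weights-only memory in GiB; optimizer states and activations add more."""
    return num_params * bytes_per_param / 1024**3

fp32 = weight_gib(params, 4)  # 4 bytes per float32 parameter
fp16 = weight_gib(params, 2)  # 2 bytes per float16 parameter
int8 = weight_gib(params, 1)  # 1 byte per int8-quantized parameter
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB, int8: {int8:.2f} GiB")
```

Note this counts weights only; during training, AdamW optimizer states and activations typically dominate, which is why gradient accumulation and smaller batches matter even at reduced precision.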