# Transformers Dynamic
Use Hugging Face Transformers for NLP, computer vision, and multimodal tasks with pre-trained models. This skill covers text classification, token classification, question answering, text generation, image classification, fine-tuning with Trainer API, and efficient inference with pipelines.
## When to Use This Skill
Choose Transformers Dynamic when you need to:
- Apply pre-trained language models (BERT, GPT, T5, Llama) to NLP tasks
- Fine-tune models on custom datasets with the Trainer API
- Run inference pipelines for text classification, NER, QA, summarization, or translation
- Use vision transformers (ViT) or multimodal models (CLIP, LLaVA)
Consider alternatives when:
- You need to build custom neural architectures from scratch (use PyTorch directly)
- You need classical ML on tabular data (use scikit-learn)
- You need to call commercial LLM APIs (use the provider's SDK)
## Quick Start

```bash
pip install transformers torch datasets
```

```python
from transformers import pipeline

# Text classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love how easy this library is to use!")
print(f"Sentiment: {result[0]['label']} ({result[0]['score']:.3f})")

# Named entity recognition
ner = pipeline("ner", grouped_entities=True)
entities = ner("Hugging Face is based in New York City and was founded by Clément Delangue.")
for ent in entities:
    print(f"  {ent['word']}: {ent['entity_group']} ({ent['score']:.3f})")

# Text generation
generator = pipeline("text-generation", model="gpt2")
output = generator("The future of AI is", max_new_tokens=50, num_return_sequences=1)
print(f"Generated: {output[0]['generated_text']}")

# Question answering
qa = pipeline("question-answering")
answer = qa(
    question="What is the capital of France?",
    context="France is a country in Western Europe. Its capital is Paris.",
)
print(f"Answer: {answer['answer']} (score: {answer['score']:.3f})")
```
## Core Concepts

### Task Pipeline Reference
| Task | Pipeline Name | Default Model | Input |
|---|---|---|---|
| Sentiment analysis | "sentiment-analysis" | distilbert-base-uncased-finetuned-sst-2-english | Text |
| NER | "ner" | dbmdz/bert-large-cased-finetuned-conll03-english | Text |
| Question answering | "question-answering" | distilbert-base-cased-distilled-squad | Question + context |
| Summarization | "summarization" | facebook/bart-large-cnn | Long text |
| Translation | "translation_en_to_fr" | Helsinki-NLP/opus-mt-en-fr | Text |
| Text generation | "text-generation" | gpt2 | Prompt |
| Fill mask | "fill-mask" | bert-base-uncased | Text with [MASK] |
| Zero-shot classification | "zero-shot-classification" | facebook/bart-large-mnli | Text + labels |
| Image classification | "image-classification" | google/vit-base-patch16-224 | Image |
| Object detection | "object-detection" | facebook/detr-resnet-50 | Image |
### Fine-Tuning with Trainer API
```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)
from datasets import load_dataset
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Load dataset
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=256)

tokenized = dataset.map(tokenize_function, batched=True)

# Load model
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }

# Training config
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=100,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"].select(range(5000)),  # Subset for demo
    eval_dataset=tokenized["test"].select(range(1000)),
    compute_metrics=compute_metrics,
)

trainer.train()
results = trainer.evaluate()
print(f"Eval accuracy: {results['eval_accuracy']:.3f}")
print(f"Eval F1: {results['eval_f1']:.3f}")
```
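The `compute_metrics` hook above only depends on NumPy and scikit-learn, so it can be sanity-checked offline with made-up logits before launching a full training run:

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    # Trainer passes an (logits, labels) pair at evaluation time
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }

# Illustrative logits for 4 examples over 2 classes; predictions are [0, 1, 1, 0]
logits = np.array([[2.0, -1.0], [0.1, 0.3], [-0.5, 1.5], [1.0, 0.2]])
labels = np.array([0, 1, 1, 1])
print(compute_metrics((logits, labels)))  # accuracy 0.75
```

Testing the hook in isolation like this catches shape and label mistakes cheaply, since a full `trainer.evaluate()` pass only surfaces them after minutes of compute.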
## Configuration

| Parameter | Description | Default |
|---|---|---|
| `model_name` | Pre-trained model identifier from Hugging Face Hub | Task-dependent |
| `max_length` | Maximum token sequence length | Model-specific (512) |
| `batch_size` | Training/inference batch size | 16 |
| `learning_rate` | Fine-tuning learning rate | 5e-5 |
| `num_train_epochs` | Training epochs | 3 |
| `warmup_steps` | Learning rate warmup steps | 500 |
| `weight_decay` | L2 regularization | 0.01 |
| `fp16` | Mixed-precision training | False |
| `gradient_accumulation_steps` | Gradient accumulation for effective batch size | 1 |
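The interaction between batch size and gradient accumulation is simple arithmetic; a minimal sketch with illustrative values (the GPU count here is an assumption, not a recommendation):

```python
# Effective batch size = per-device batch * accumulation steps * number of GPUs.
# Values below are illustrative, not tuned recommendations.
per_device_train_batch_size = 4
gradient_accumulation_steps = 8
num_gpus = 1  # assumed single-GPU machine

effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(f"Effective batch size: {effective_batch_size}")  # Effective batch size: 32
```

This is why halving the per-device batch while doubling accumulation steps keeps training dynamics roughly unchanged at lower peak memory.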
## Best Practices

- **Use pipelines for quick inference, Trainer for fine-tuning.** Pipelines handle tokenization, batching, and post-processing automatically; use them for prototyping and production inference. Switch to the Trainer API only when you need to fine-tune a model on custom data with full control over the training loop.
- **Start with the smallest model that works.** DistilBERT is 60% faster than BERT with 97% of its accuracy. Start with distilled or small variants, and only scale up if performance is insufficient. For text generation, start with small models (GPT-2) before moving to larger ones (Llama, Mistral).
- **Set `max_length` based on your actual data.** The default (512 or 1024) wastes memory on short texts. Analyze your data's token length distribution and set `max_length` to the 95th percentile. Use `truncation=True` to handle outliers and `padding="max_length"` for consistent batch shapes.
- **Use `device_map="auto"` for large models.** Models larger than GPU memory can be automatically sharded across multiple GPUs or offloaded to CPU with `model = AutoModel.from_pretrained("model_name", device_map="auto")`. This is essential for running 7B+ parameter models on consumer hardware.
- **Enable mixed-precision training for a ~2x speedup.** Set `fp16=True` in `TrainingArguments` on NVIDIA GPUs (or `bf16=True` on Ampere or newer). This halves memory usage and nearly doubles training speed with negligible accuracy impact. Always validate that training loss remains stable after enabling it.
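The `max_length` percentile advice can be sketched as follows. The lengths here are synthetic stand-ins; in practice you would compute them from your tokenizer's `input_ids`:

```python
import numpy as np

# Synthetic token lengths standing in for real data; in practice:
#   lengths = [len(ids) for ids in tokenizer(dataset["text"])["input_ids"]]
rng = np.random.default_rng(0)
lengths = rng.lognormal(mean=4.0, sigma=0.6, size=10_000).astype(int)

p95 = int(np.percentile(lengths, 95))
# Round up to a multiple of 8, which tends to map well onto GPU kernels
max_length = ((p95 + 7) // 8) * 8
print(f"95th percentile: {p95} tokens -> max_length={max_length}")
```

Picking the 95th percentile rather than the maximum trades truncation of a few outliers for substantially smaller padded batches.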
## Common Issues

**"CUDA out of memory" during training**: Reduce `per_device_train_batch_size`, enable `gradient_accumulation_steps` to maintain the effective batch size, enable `fp16=True`, or reduce `max_length`. For inference, use `model.half()` or `load_in_8bit=True` with bitsandbytes.

**Tokenizer produces unexpected results**: Different models use different tokenizers (WordPiece, BPE, SentencePiece). Always use the tokenizer matched to your model: `AutoTokenizer.from_pretrained("model_name")`. Never mix tokenizers across models; the vocabulary indices will be misaligned.

**Fine-tuned model performs worse than the base model**: Common causes: learning rate too high (try 1e-5 to 5e-5), insufficient data (under 1000 samples; consider few-shot prompting instead), or catastrophic forgetting. Use warmup steps, lower learning rates, and evaluate after each epoch to catch overfitting early.
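As a back-of-envelope check on why `fp16` and 8-bit loading relieve memory pressure, weight memory scales linearly with bytes per parameter (the parameter count below is approximate):

```python
# Approximate weights-only memory footprint at different precisions.
# DistilBERT-base has roughly 66M parameters (approximate figure).
params = 66_000_000

def weight_gib(num_params: int, bytes_per_param: int) -> float:
    """Weights-only memory in GiB; optimizer states and activations add more."""
    return num_params * bytes_per_param / 1024**3

fp32 = weight_gib(params, 4)  # 4 bytes per float32 parameter
fp16 = weight_gib(params, 2)  # 2 bytes per float16 parameter
int8 = weight_gib(params, 1)  # 1 byte per int8-quantized parameter
print(f"fp32: {fp32:.2f} GiB, fp16: {fp16:.2f} GiB, int8: {int8:.2f} GiB")
```

Note this counts weights only; during training, AdamW optimizer states and activations typically dominate, which is why gradient accumulation and smaller batches matter even at reduced precision.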