Transformers Dynamic

Enterprise-grade skill for applying Hugging Face Transformers to scientific and NLP workflows. Includes structured workflows, validation checks, and reusable patterns.


Use Hugging Face Transformers for NLP, computer vision, and multimodal tasks with pre-trained models. This skill covers text classification, token classification, question answering, text generation, image classification, fine-tuning with Trainer API, and efficient inference with pipelines.

When to Use This Skill

Choose Transformers Dynamic when you need to:

  • Apply pre-trained language models (BERT, GPT, T5, Llama) to NLP tasks
  • Fine-tune models on custom datasets with the Trainer API
  • Run inference pipelines for text classification, NER, QA, summarization, or translation
  • Use vision transformers (ViT) or multimodal models (CLIP, LLaVA)

Consider alternatives when:

  • You need to build custom neural architectures from scratch (use PyTorch directly)
  • You need classical ML on tabular data (use scikit-learn)
  • You need to call commercial LLM APIs (use the provider's SDK)

Quick Start

pip install transformers torch datasets
from transformers import pipeline

# Text classification
classifier = pipeline("sentiment-analysis")
result = classifier("I love how easy this library is to use!")
print(f"Sentiment: {result[0]['label']} ({result[0]['score']:.3f})")

# Named entity recognition
ner = pipeline("ner", grouped_entities=True)
entities = ner("Hugging Face is based in New York City and was founded by Clément Delangue.")
for ent in entities:
    print(f"  {ent['word']}: {ent['entity_group']} ({ent['score']:.3f})")

# Text generation
generator = pipeline("text-generation", model="gpt2")
output = generator("The future of AI is", max_new_tokens=50, num_return_sequences=1)
print(f"Generated: {output[0]['generated_text']}")

# Question answering
qa = pipeline("question-answering")
answer = qa(
    question="What is the capital of France?",
    context="France is a country in Western Europe. Its capital is Paris.",
)
print(f"Answer: {answer['answer']} (score: {answer['score']:.3f})")

Core Concepts

Task Pipeline Reference

| Task | Pipeline Name | Default Model | Input |
|------|---------------|---------------|-------|
| Sentiment analysis | "sentiment-analysis" | distilbert-sst-2 | Text |
| NER | "ner" | dbmdz/bert-ner | Text |
| Question answering | "question-answering" | distilbert-squad | Question + context |
| Summarization | "summarization" | facebook/bart-large-cnn | Long text |
| Translation | "translation_en_to_fr" | Helsinki-NLP/opus-mt | Text |
| Text generation | "text-generation" | gpt2 | Prompt |
| Fill mask | "fill-mask" | bert-base-uncased | Text with [MASK] |
| Zero-shot classification | "zero-shot-classification" | facebook/bart-large-mnli | Text + labels |
| Image classification | "image-classification" | google/vit-base-patch16-224 | Image |
| Object detection | "object-detection" | facebook/detr-resnet-50 | Image |
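For programmatic task selection, the mapping in the table above can be captured in a small lookup dict. This is a sketch specific to this doc, not a Transformers API; the model strings are the table's shorthand names, which are not all exact Hub checkpoint IDs:

```python
# Task registry mirroring the reference table above.
# Values: (default model shorthand from the table, expected input kind).
PIPELINE_DEFAULTS = {
    "sentiment-analysis": ("distilbert-sst-2", "text"),
    "ner": ("dbmdz/bert-ner", "text"),
    "question-answering": ("distilbert-squad", "question + context"),
    "summarization": ("facebook/bart-large-cnn", "long text"),
    "translation_en_to_fr": ("Helsinki-NLP/opus-mt", "text"),
    "text-generation": ("gpt2", "prompt"),
    "fill-mask": ("bert-base-uncased", "text with [MASK]"),
    "zero-shot-classification": ("facebook/bart-large-mnli", "text + labels"),
    "image-classification": ("google/vit-base-patch16-224", "image"),
    "object-detection": ("facebook/detr-resnet-50", "image"),
}

def default_model(task: str) -> str:
    """Return the table's default model for a pipeline task name."""
    model, _input_kind = PIPELINE_DEFAULTS[task]
    return model

print(default_model("summarization"))  # facebook/bart-large-cnn
```

A registry like this is useful when a workflow dispatches over several tasks and you want one place to audit which checkpoints are in use.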

Fine-Tuning with Trainer API

from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    TrainingArguments,
    Trainer,
)
from datasets import load_dataset
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

# Load dataset
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True, max_length=256)

tokenized = dataset.map(tokenize_function, batched=True)

# Load model
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, predictions),
        "f1": f1_score(labels, predictions, average="weighted"),
    }

# Training config
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
    logging_steps=100,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="f1",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized["train"].select(range(5000)),  # Subset for demo
    eval_dataset=tokenized["test"].select(range(1000)),
    compute_metrics=compute_metrics,
)

trainer.train()
results = trainer.evaluate()
print(f"Eval accuracy: {results['eval_accuracy']:.3f}")
print(f"Eval F1: {results['eval_f1']:.3f}")

Configuration

| Parameter | Description | Default |
|-----------|-------------|---------|
| model_name | Pre-trained model identifier from the Hugging Face Hub | Task-dependent |
| max_length | Maximum token sequence length | Model-specific (512) |
| batch_size | Training/inference batch size | 16 |
| learning_rate | Fine-tuning learning rate | 5e-5 |
| num_train_epochs | Training epochs | 3 |
| warmup_steps | Learning rate warmup steps | 500 |
| weight_decay | L2 regularization | 0.01 |
| fp16 | Mixed-precision training | False |
| gradient_accumulation_steps | Gradient accumulation for effective batch size | 1 |
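To keep runs reproducible, these defaults can be centralized and splatted into TrainingArguments as keyword arguments. The parameter names below are real TrainingArguments fields; the FINE_TUNE_DEFAULTS dict and training_kwargs helper are sketches introduced here, not part of the library:

```python
# Central record of the fine-tuning defaults from the table above.
FINE_TUNE_DEFAULTS = {
    "learning_rate": 5e-5,
    "num_train_epochs": 3,
    "per_device_train_batch_size": 16,
    "warmup_steps": 500,
    "weight_decay": 0.01,
    "fp16": False,
    "gradient_accumulation_steps": 1,
}

def training_kwargs(**overrides):
    """Merge caller overrides onto the documented defaults."""
    cfg = dict(FINE_TUNE_DEFAULTS)
    cfg.update(overrides)
    return cfg

# Example: raise the effective batch via accumulation, keep everything else.
cfg = training_kwargs(gradient_accumulation_steps=4)
# Then: TrainingArguments(output_dir="./results", **cfg)
```

Logging the merged dict alongside checkpoints makes it easy to see exactly which hyperparameters produced a given run.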

Best Practices

  1. Use pipelines for quick inference, Trainer for fine-tuning β€” Pipelines handle tokenization, batching, and post-processing automatically. Use them for prototyping and production inference. Switch to the Trainer API only when you need to fine-tune a model on custom data with full control over the training loop.

  2. Start with the smallest model that works β€” DistilBERT is 60% faster than BERT with 97% of its accuracy. Start with distilled or small variants, and only scale up if performance is insufficient. For text generation, start with small models (GPT-2) before moving to larger ones (Llama, Mistral).

  3. Set max_length based on your actual data β€” The default (512 or 1024) wastes memory on short texts. Analyze your data's token length distribution and set max_length to the 95th percentile. Use truncation=True to handle outliers and padding="max_length" for consistent batch shapes.
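The percentile approach above can be sketched in a few lines. In practice you would measure lengths with your model's actual tokenizer (len(tokenizer(text)["input_ids"])); tokenize_and_count here is a whitespace-split stand-in so the sketch runs without downloading a model:

```python
import math

def tokenize_and_count(text: str) -> int:
    # Stand-in for len(tokenizer(text)["input_ids"]); whitespace split
    # only, used so this sketch runs without a model download.
    return len(text.split())

def percentile_95(lengths):
    """95th-percentile length using the nearest-rank method."""
    ordered = sorted(lengths)
    rank = math.ceil(0.95 * len(ordered)) - 1
    return ordered[rank]

corpus = ["short text", "a somewhat longer example sentence here", "hi"]
lengths = [tokenize_and_count(t) for t in corpus]
max_length = percentile_95(lengths)
```

Run this over your training split once, round max_length up to a convenient value, and pass it to the tokenizer together with truncation=True.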

  4. Use device_map="auto" for large models β€” Models larger than GPU memory can be automatically sharded across multiple GPUs or offloaded to CPU with model = AutoModel.from_pretrained("model_name", device_map="auto"). This is essential for running 7B+ parameter models on consumer hardware.
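A back-of-envelope check for whether a model's weights fit at all: parameter count times bytes per parameter (2 for fp16/bf16, 4 for fp32), ignoring activations, optimizer state, and KV cache. A rough sketch:

```python
def model_weight_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB (weights only; activations,
    optimizer state, and KV cache add more on top)."""
    return n_params * bytes_per_param / 1024**3

# A 7B-parameter model in fp16 needs ~13 GiB just for weights, which is
# why device_map="auto" sharding/offload matters on 24 GiB consumer cards.
print(f"{model_weight_gib(7e9):.1f} GiB")  # ~13.0 GiB
```

If the estimate exceeds a single GPU's memory, device_map="auto" (or 8-bit loading) is the first knob to reach for.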

  5. Enable mixed-precision training for 2x speedup β€” Set fp16=True in TrainingArguments on NVIDIA GPUs (or bf16=True on Ampere+). This halves memory usage and nearly doubles training speed with negligible accuracy impact. Always validate that training loss remains stable after enabling.

Common Issues

"CUDA out of memory" during training β€” Reduce per_device_train_batch_size, enable gradient_accumulation_steps to maintain effective batch size, enable fp16=True, or reduce max_length. For inference, use model.half() or load_in_8bit=True with bitsandbytes.

Tokenizer produces unexpected results β€” Different models use different tokenizers (WordPiece, BPE, SentencePiece). Always use the tokenizer matched to your model: AutoTokenizer.from_pretrained("model_name"). Never mix tokenizers across models β€” the vocabulary indices will be misaligned.

Fine-tuned model performs worse than the base model β€” Common causes: learning rate too high (try 1e-5 to 5e-5), insufficient data (<1000 samples β€” use few-shot instead), or catastrophic forgetting. Use warmup steps, lower learning rates, and evaluate after each epoch to catch overfitting early.
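The warmup advice corresponds to the linear warmup-then-linear-decay schedule Transformers exposes as get_linear_schedule_with_warmup. A minimal re-implementation of the learning-rate multiplier, for intuition only (not the library's code):

```python
def linear_warmup_decay(step: int, warmup_steps: int, total_steps: int) -> float:
    """LR multiplier: ramp 0 -> 1 over warmup_steps, then decay linearly to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

base_lr = 2e-5  # within the 1e-5 to 5e-5 range recommended above
schedule = [
    base_lr * linear_warmup_decay(s, warmup_steps=500, total_steps=5000)
    for s in range(0, 5001, 500)
]
```

The gentle ramp keeps early updates small, which is exactly what protects a pre-trained model from the catastrophic forgetting described above.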
