FlanT5

FlanT5(model_id: Optional[str] = None)

Bases: FoundationalModel

Flan-T5 is an instruction-tuned text-to-text generation model from Google. It is built on top of the generic T5 model.

Examples:

Invoking the default Flan-T5 model

from baseten.models import FlanT5
model = FlanT5()
model("The quick brown fox jumps over the lazy dog.")

Setting bad words

For production apps, you may find it useful to set a list of "bad words" that tells the model which words it cannot use during generation. Blueprint supports this by letting you pass the list at call time or set it as an attribute on the model.

# Pass bad_words directly in the call
model = FlanT5()
model("The quick brown fox jumps over the lazy dog.", bad_words=["dog"])

# Set bad_words as an attribute before calling
model = FlanT5()
model.bad_words = ["dog"]
model("The quick brown fox jumps over the lazy dog.")

# bad_words combines with other generation parameters
model = FlanT5()
model.bad_words = ["dog"]
model("The quick brown fox jumps over the lazy dog.", num_beams=4)

Attributes:

    bad_words (list): List of words to avoid in the output.

__call__

__call__(prompt: str, seed: Optional[int] = None, **kwargs) -> str

Generate text from a prompt. Supports all generation parameters of the underlying Hugging Face Transformers model; some of the most common are listed below, and a usage sketch follows the list. See the Hugging Face Transformers .generate() documentation for more details.

Parameters:

    prompt (str): The prompt to generate text from. Required.
    seed (int, optional): The random seed to use for reproducibility. Default: None.
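For example, to make generation reproducible with seed (a minimal sketch; this assumes the seed fully determines sampling, so the exact reproducibility guarantees depend on the serving backend):

from baseten.models import FlanT5

model = FlanT5()
# Passing the same seed with the same prompt and parameters
# should produce the same output across calls.
first = model("Translate English to German: Hello, world.", seed=42)
second = model("Translate English to German: Hello, world.", seed=42)
assert first == second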

Other Parameters:

    num_beams (int): Number of beams for beam search. 1 means no beam search. Default: 1.
    num_return_sequences (int): The number of independently computed returned sequences for each element in the batch. Default: 1.
    max_length (int): The maximum number of tokens that can be generated.
    temperature (float): The value used to modulate the next token probabilities. Must be strictly positive. Default: 1.0.
    top_k (int): The number of highest-probability vocabulary tokens to keep for top-k filtering. Must be at least 1. Default: 50.
    top_p (float): If set to a float < 1, only the most probable tokens with probabilities that add up to top_p or higher are kept for generation. Default: 1.0.
    repetition_penalty (float): The parameter for repetition penalty. Must be at least 1.0; 1.0 means no penalty. Default: 1.0.
    length_penalty (float): Exponential penalty to the sequence length. Default: 1.0.
    no_repeat_ngram_size (int): If set to an int > 0, all n-grams of that size can occur only once. Default: 0.
    do_sample (bool): If set to False, greedy decoding is used; otherwise sampling is used. Default: True.
    early_stopping (bool): Whether to stop the beam search when at least num_beams sentences are finished per batch. Default: False.
    use_cache (bool): Whether the model should use the past key/values attentions (if applicable to the model) to speed up decoding. Default: True.
    decoder_start_token_id (int): If an encoder-decoder model starts decoding with a different token than BOS, the id of that token. Default: None.
    pad_token_id (int): The id of the padding token. Default: None.
    eos_token_id (int): The id of the end-of-sequence token. Default: None.
    forced_bos_token_id (int): The id of the token to force as the first generated token after the BOS token. Default: None.
    forced_eos_token_id (int): The id of the token to force as the last generated token when max_length is reached. Default: None.
    remove_invalid_values (bool): Whether to remove possible nan and inf outputs of the model to prevent the generation method from crashing. Default: False.
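The keyword arguments above can be combined freely. As a sketch (parameter names as documented above; the prompts are illustrative):

from baseten.models import FlanT5

model = FlanT5()

# Sampled generation: lower temperature plus nucleus filtering
# for more focused but still varied output.
model(
    "Summarize: The quick brown fox jumps over the lazy dog.",
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    max_length=64,
)

# Beam search: deterministic decoding with repetition control.
model(
    "Summarize: The quick brown fox jumps over the lazy dog.",
    do_sample=False,
    num_beams=4,
    no_repeat_ngram_size=2,
    early_stopping=True,
)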

Returns:

    generated_text (str): The generated text.

url staticmethod

url()

Use this static method to get a URL to the Flan-T5 model page in your Blueprint project, which contains information about the model.

Example:

from baseten.models import FlanT5

FlanT5.url()