Llama2.jl

What is Llama2?

Llama2 is a family of pre-trained LLMs by Meta AI. More information can be found at: https://www.llama.com/

What is Llama2.jl?

Llama2.jl can run inference on a given model from within Julia. To do so, you will have to provide your own model checkpoint. This project follows the procedure outlined in the run.c file from llama2.c.

Getting started

Clone the repository to a desired location:

cd /PATH/TO/DESIRED/LOCATION/
git clone git@github.com:ConstantConstantin/Llama2.git

Start Julia, activate an environment of your choice, and add the package; it can then be loaded into your session:

(@v1.11) pkg> activate .
  Activating new project at `PATH/TO/MY/ENVIRONMENT/myLlama2`

(myLlama2) pkg> add /PATH/TO/DESIRED/LOCATION/Llama2/
     Cloning git-repo `/PATH/TO/DESIRED/LOCATION/Llama2`
    Updating git-repo `/PATH/TO/DESIRED/LOCATION/Llama2`
    Updating registry at `~/.julia/registries/General.toml`
   Resolving package versions...
    Updating `PATH/TO/MY/ENVIRONMENT/Project.toml`
  [0e620e9f] + Llama2 v1.0.0-DEV `/PATH/TO/DESIRED/LOCATION/Llama2#aj/docs`
    Updating `PATH/TO/MY/ENVIRONMENT/Manifest.toml`
  [0e620e9f] + Llama2 v1.0.0-DEV `/PATH/TO/DESIRED/LOCATION/Llama2#aj/docs`
Precompiling project...
  1 dependency successfully precompiled in 1 seconds

julia> using Llama2
Llama2.Config (Type)
Config

Create a Config containing seven Int32 values. These describe the metadata used to read values from an input file.

Developer Notes

This is an internal struct.
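The header layout can be sketched with plain binary I/O: seven consecutive Int32 values at the start of the file. The example values and the field names in the comment follow the llama2.c convention and are assumptions here, not the package's definition:

```julia
# Write a 7-value Int32 header into a buffer, then read it back the way a
# Config-style loader would. The concrete numbers are invented.
io = IOBuffer()
write(io, Int32[288, 288, 6, 6, 6, 32000, 256])  # dim, hidden_dim, n_layers, n_heads, n_kv_heads, vocab_size, seq_len (assumed order)
seekstart(io)
header = [read(io, Int32) for _ in 1:7]
```

Reading with an explicit element type (`read(io, Int32)`) matches the fixed-width binary layout regardless of the host's default integer size.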

Llama2.RunState (Type)
RunState

Create a RunState containing several Float32 containers. These reflect the state of the Transformer at run-time.

Developer Notes

This is an internal struct.

Llama2.TokenIndex (Type)
TokenIndex(str::String, id::Integer)

Create a TokenIndex from a string and an integer identifier.

The byte sequence is converted to String and the ID is converted to Int16. Throw a DomainError if id ≤ 0.

Examples

julia> using Llama2;

julia> TokenIndex("Julia", 1)
TokenIndex("Julia", 1)

julia> TokenIndex("Julia", -1)
ERROR: DomainError with Token index must be > 0.
[...]

Developer Notes

This is an internal struct.

Llama2.Tokenizer (Type)
Tokenizer

Construct a tokenizer storing vocabulary entries, scores, and byte-piece mappings.

Constructors

  • Tokenizer(vocab, vocab_scores, sorted_vocab, vocab_size, max_token_length, byte_pieces) Construct a tokenizer directly from the provided fields. Validate that max_token_length > 0 and that byte_pieces has length 256.

  • Tokenizer(path::String, vocab_size::Integer) Load a tokenizer from a binary file.

Fields

  • vocab: Token string sequences.
  • vocab_scores: Scores for each token.
  • sorted_vocab: Sorted token indices.
  • vocab_size: Number of vocabulary entries.
  • max_token_length: Maximum token length in bytes.
  • byte_pieces: Byte mapping (length 256).
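The byte_pieces field can be pictured as a table with one entry per possible byte value, which is why its length must be 256. The sketch below builds such a table from single-byte strings; the exact contents the package stores are an assumption here:

```julia
# One single-byte string per byte value 0x00..0xff, giving a table of
# length 256 as the Tokenizer constructor validates.
byte_pieces = [string(Char(b)) for b in 0:255]
```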
Llama2.Transformer (Method)
Transformer(path::String)

Load the binary file at location path and construct a Transformer from its content. The file is expected to have a header of 7 Int32 values followed by Float32 data.

Example

julia> t = Llama2.Transformer("/PATH/TO/YOUR.bin");
Llama2.TransformerWeights (Type)
TransformerWeights

Create a TransformerWeights containing several Float32 containers. These hold the actual weight data loaded from an input file.

Developer Notes

This is an internal struct.

Llama2.compare_tokens (Method)
compare_tokens(first_token::TokenIndex, second_token::TokenIndex) -> Bool

Compare two TokenIndex objects by their string values. Return true if the first token's string is lexicographically less than the second's, and false otherwise.

Examples

julia> using Llama2;

julia> compare_tokens(TokenIndex("A", 1), TokenIndex("B", 2))
true

julia> compare_tokens(TokenIndex("B", 1), TokenIndex("A", 2))
false
Llama2.encode (Method)
encode

Convert a string text into a sequence of token IDs using a Tokenizer. The tokenizer's vocabulary is first sorted, then each character is encoded into its corresponding ID. Token pairs with the highest scores are then iteratively merged into longer tokens until no more merges are possible, and the final token ID sequence is returned.
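The merge loop described above can be sketched independently of the package. The function below works on plain strings with an invented table of pair scores; it illustrates the greedy highest-score merge, not the package's actual encode:

```julia
# Repeatedly merge the adjacent token pair with the highest score until
# no remaining pair appears in the score table.
function merge_pairs(tokens::Vector{String}, scores::Dict{String,Float64})
    tokens = copy(tokens)
    while true
        best_i, best_score = 0, -Inf
        for i in 1:length(tokens)-1
            s = get(scores, tokens[i] * tokens[i+1], -Inf)
            if s > best_score
                best_i, best_score = i, s
            end
        end
        best_score == -Inf && break  # no mergeable pair left
        merged = tokens[best_i] * tokens[best_i+1]
        tokens = vcat(tokens[1:best_i-1], [merged], tokens[best_i+2:end])
    end
    return tokens
end
```

With tokens ["a", "b", "c"] and scores Dict("ab" => 2.0, "bc" => 1.0), the pair "ab" wins the first round, and the loop stops because "abc" has no score.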

Llama2.str_lookup (Method)
str_lookup(str::String, sorted_vocab::Vector{TokenIndex}) -> Int16

Search for the string str within a sorted vocabulary sorted_vocab of TokenIndex objects using binary search for efficient lookup. Return the corresponding token ID if the string is found, and -1 otherwise.

Examples

julia> using Llama2;

julia> str_lookup("aa", [TokenIndex("aa", 1), TokenIndex("bb", 2)])
1

julia> str_lookup("ba", [TokenIndex("aa", 1), TokenIndex("bb", 2)])
-1
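The binary search itself can be sketched over a plain sorted vector of strings; the helper name lookup is invented here and returns the position rather than a stored token ID:

```julia
# Classic binary search over lexicographically sorted strings, mirroring
# the lookup strategy described above.
function lookup(str::String, vocab::Vector{String})
    lo, hi = 1, length(vocab)
    while lo <= hi
        mid = (lo + hi) ÷ 2
        if vocab[mid] == str
            return mid
        elseif vocab[mid] < str
            lo = mid + 1
        else
            hi = mid - 1
        end
    end
    return -1
end
```

Because each step halves the search range, lookup cost is O(log n) in the vocabulary size, which is why str_lookup requires the vocabulary to be sorted.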