【LLaMa2入门】从零开始训练LLaMa2

目录

  • 1 背景
  • 2 搭建环境
    • 2.1 硬件配置
    • 2.2 搭建虚拟环境
      • 2.2.1 创建虚拟环境
      • 2.2.2 安装所需的库
  • 3 准备工作
    • 3.1 下载GitHub代码
    • 3.2 下载模型
    • 3.3 数据处理
      • 3.3.1 下载数据
      • 3.3.2 数据集tokenize预处理
  • 4 训练
    • 4.1 修改配置
    • 4.2 开始训练
    • 4.3 多机多卡训练
  • 5 模型推理
    • 5.1 编译
      • 5.1.1 安装gcc
      • 5.1.2 修改run.c代码
      • 5.1.3 生成可执行文件
    • 5.2 模型推理
  • 6 结束语

1 背景

从零开始学大模型之——LLaMa2-7B。

2 搭建环境

anaconda的按照教程请参考:
Linux安装conda
conda离线安装pytorch

2.1 硬件配置

系统:windows 11
内存:48GB
显卡:RTX 4070,12GB
处理器:i5-13600KF

2.2 搭建虚拟环境

2.2.1 创建虚拟环境

conda create --name llama2 python=3.10

2.2.2 安装所需的库

安装torch:

conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia

安装其他库:

pip install numpy==1.23.5 pytest Requests sentencepiece tqdm wandb

3 准备工作

3.1 下载GitHub代码

git clone https://github.com/karpathy/llama2.c.git

3.2 下载模型

到huggingface网站上下载Llama-2-7b-chat-hf模型。
需要科学上网,然后注册账号,申请权限通过后即可下载。

由于模型数据都来源于huggingface,从huggingface下载模型权重和训练数据过程中可能会遇到各种网络问题,可以下载时通过走huggingface的国内镜像hf-mirror加快下载速度,然后使用huggingface-cli进行模型文件和数据的下载。

具体操作请参考网上其他资料,此处仅提供部分操作代码:

# 下载huggingface_hub pip install -U huggingface_hub# 设置环境变量export HF_ENDPOINT=https://hf-mirror.com# 下载模型,指定模型的保存位置huggingface-cli download --resume-download NousResearch/Meta-Llama-3-8B --local-dir ./model/Meta-Llama-3-8B

3.3 数据处理

3.3.1 下载数据

python tinystories.py download

3.3.2 数据集tokenize预处理

python tinystories.py pretokenize

4 训练

4.1 修改配置

train.py里面有几个参数要修改
batch_size改小一点,否则会报’CUDA out of memory’ 的错误;
dtype要改为"float16",否则会报’Current CUDA Device does not support bfloat16’的错误;
compile要改为False,否则会报CUDA Capability过低或complex64不支持的错误。

batch_size = 64
dtype = "float16"
compile = False

可选改的参数:
max_iters:是迭代次数,可改小一点。
warmup_iters:是热身的迭代次数,主要是为了确定合适得学习率,卡有限的话可改小一些。

max_iters = 100000
warmup_iters = 1000

4.2 开始训练

python train.py

我设置的参数:

batch_size = 128  # if gradient_accumulation_steps > 1, this is the micro-batch size
max_iters = 100000  # total number of training iterations
warmup_iters = 1000  # how many steps to warm up for
dtype = "float16"  # float32|bfloat16|float16
compile = False  # use PyTorch 2.0 to compile the model to be faster

Loss:
训练loss情况
显存占用情况:
显存占用情况

训练结束后,会在out文件夹下保存ckpt:
模型

4.3 多机多卡训练

可以在命令行中指定训练参数(单GPU上训练):

python -m train.py --compile=False --eval_iters=100 --batch_size=64

如果是多GPU,可采用分布式训练,例如采用DDP 在1个node,2个 gpu 上训练:

torchrun --standalone --nproc_per_node=2 train.py

采用DDP 在2个node,8个 gpu 上训练:

torchrun --nproc_per_node=4 --nnodes=2 --node_rank=0 --master_addr=123.456.123.456 --master_port=1234 train.py

5 模型推理

注意:Linux跑没问题,Windows跑没输出,暂时还没解决Windows的问题。

5.1 编译

如果是Linux,直接make即可。

gcc run.c -o run -lm

如果是windows,需要先安装gcc,然后修改run.c代码,最后再生成可执行文件。

5.1.1 安装gcc

在gcc官网下载压缩文件,解压到本地。
gcc

配置环境变量:

  • 解压完成后,需要将MinGW-w64的bin目录添加到系统的PATH环境变量中。
  • 打开“控制面板”,选择“系统和安全”,然后选择“系统”。
  • 点击左侧的“高级系统设置”,在弹出的对话框中点击“环境变量”。
  • 在“系统变量”区域,找到Path变量,选中后点击“编辑”。
  • 点击“新建”,然后将MinGW-w64的bin目录路径添加进去,例如C:\Program Files\mingw-w64\mingw64\bin。
  • 确认所有对话框并关闭。

测试:在终端输入

gcc --version

没有报错说明安装成功。
gcc版本

5.1.2 修改run.c代码

在 Windows 系统上,由于 mmap 和 munmap 不是标准的 Win32 API 函数,需要使用相应的 Win32 函数来实现类似的功能。可以使用 CreateFileMapping 和 MapViewOfFile 来代替 mmap,使用 UnmapViewOfFile 来代替 munmap。
将以下代码拷贝覆盖原文件即可。

/* Inference for Llama-2 Transformer model in pure C */#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <time.h>
#include <math.h>
#include <string.h>
#include <fcntl.h>
#if defined _WIN32#include "win.h"
#else#include <unistd.h>#include <sys/mman.h>
#endif
// ----------------------------------------------------------------------------
// Transformer modeltypedef struct {int dim; // transformer dimensionint hidden_dim; // for ffn layersint n_layers; // number of layersint n_heads; // number of query headsint n_kv_heads; // number of key/value heads (can be < query heads because of multiquery)int vocab_size; // vocabulary size, usually 256 (byte-level)int seq_len; // max sequence length
} Config;typedef struct {// token embedding tablefloat* token_embedding_table;    // (vocab_size, dim)// weights for rmsnormsfloat* rms_att_weight; // (layer, dim) rmsnorm weightsfloat* rms_ffn_weight; // (layer, dim)// weights for matmuls. note dim == n_heads * head_sizefloat* wq; // (layer, dim, n_heads * head_size)float* wk; // (layer, dim, n_kv_heads * head_size)float* wv; // (layer, dim, n_kv_heads * head_size)float* wo; // (layer, n_heads * head_size, dim)// weights for ffnfloat* w1; // (layer, hidden_dim, dim)float* w2; // (layer, dim, hidden_dim)float* w3; // (layer, hidden_dim, dim)// final rmsnormfloat* rms_final_weight; // (dim,)// (optional) classifier weights for the logits, on the last layerfloat* wcls;
} TransformerWeights;typedef struct {// current wave of activationsfloat *x; // activation at current time stamp (dim,)float *xb; // same, but inside a residual branch (dim,)float *xb2; // an additional buffer just for convenience (dim,)float *hb; // buffer for hidden dimension in the ffn (hidden_dim,)float *hb2; // buffer for hidden dimension in the ffn (hidden_dim,)float *q; // query (dim,)float *k; // key (dim,)float *v; // value (dim,)float *att; // buffer for scores/attention values (n_heads, seq_len)float *logits; // output logits// kv cachefloat* key_cache;   // (layer, seq_len, dim)float* value_cache; // (layer, seq_len, dim)
} RunState;typedef struct {Config config; // the hyperparameters of the architecture (the blueprint)TransformerWeights weights; // the weights of the modelRunState state; // buffers for the "wave" of activations in the forward pass// some more state needed to properly clean up the memory mapping (sigh)HANDLE fileHandle; // file handle for memory mappingvoid* data; // memory mapped data pointerssize_t file_size; // size of the checkpoint file in bytes
} Transformer;void malloc_run_state(RunState* s, Config* p) {// we calloc instead of malloc to keep valgrind happyint kv_dim = (p->dim * p->n_kv_heads) / p->n_heads;s->x = calloc(p->dim, sizeof(float));s->xb = calloc(p->dim, sizeof(float));s->xb2 = calloc(p->dim, sizeof(float));s->hb = calloc(p->hidden_dim, sizeof(float));s->hb2 = calloc(p->hidden_dim, sizeof(float));s->q = calloc(p->dim, sizeof(float));s->key_cache = calloc(p->n_layers * p->seq_len * kv_dim, sizeof(float));s->value_cache = calloc(p->n_layers * p->seq_len * kv_dim, sizeof(float));s->att = calloc(p->n_heads * p->seq_len, sizeof(float));s->logits = calloc(p->vocab_size, sizeof(float));// ensure all mallocs went fineif (!s->x || !s->xb || !s->xb2 || !s->hb || !s->hb2 || !s->q|| !s->key_cache || !s->value_cache || !s->att || !s->logits) {fprintf(stderr, "malloc failed!\n");exit(EXIT_FAILURE);}
}void free_run_state(RunState* s) {free(s->x);free(s->xb);free(s->xb2);free(s->hb);free(s->hb2);free(s->q);free(s->att);free(s->logits);free(s->key_cache);free(s->value_cache);
}void memory_map_weights(TransformerWeights *w, Config* p, float* ptr, int shared_weights) {int head_size = p->dim / p->n_heads;// make sure the multiplications below are done in 64bit to fit the parameter counts of 13B+ modelsunsigned long long n_layers = p->n_layers;w->token_embedding_table = ptr;ptr += p->vocab_size * p->dim;w->rms_att_weight = ptr;ptr += n_layers * p->dim;w->wq = ptr;ptr += n_layers * p->dim * (p->n_heads * head_size);w->wk = ptr;ptr += n_layers * p->dim * (p->n_kv_heads * head_size);w->wv = ptr;ptr += n_layers * p->dim * (p->n_kv_heads * head_size);w->wo = ptr;ptr += n_layers * (p->n_heads * head_size) * p->dim;w->rms_ffn_weight = ptr;ptr += n_layers * p->dim;w->w1 = ptr;ptr += n_layers * p->dim * p->hidden_dim;w->w2 = ptr;ptr += n_layers * p->hidden_dim * p->dim;w->w3 = ptr;ptr += n_layers * p->dim * p->hidden_dim;w->rms_final_weight = ptr;ptr += p->dim;ptr += p->seq_len * head_size / 2; // skip what used to be freq_cis_real (for RoPE)ptr += p->seq_len * head_size / 2; // skip what used to be freq_cis_imag (for RoPE)w->wcls = shared_weights ? w->token_embedding_table : ptr;
}void read_checkpoint(char* checkpoint, Config* config, TransformerWeights* weights,HANDLE* fileHandle, void** data, ssize_t* file_size) {// read in the config headerFILE *file = fopen(checkpoint, "rb");if (!file) { fprintf(stderr, "Couldn't open file %s\n", checkpoint); exit(EXIT_FAILURE); }if (fread(config, sizeof(Config), 1, file) != 1) { exit(EXIT_FAILURE); }// negative vocab size is hacky way of signaling unshared weights. bit yikes.int shared_weights = config->vocab_size > 0 ? 1 : 0;config->vocab_size = abs(config->vocab_size);// figure out the file sizefseek(file, 0, SEEK_END); // move file pointer to end of file*file_size = ftell(file); // get the file size, in bytesfclose(file);// open the file*fileHandle = CreateFile(checkpoint, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);if (*fileHandle == INVALID_HANDLE_VALUE) {fprintf(stderr, "open failed!\n");exit(EXIT_FAILURE);}// create a file mapping objectHANDLE hMapFile = CreateFileMapping(*fileHandle, NULL, PAGE_READONLY, 0, 0, NULL);if (hMapFile == NULL) {fprintf(stderr, "CreateFileMapping failed!\n");CloseHandle(*fileHandle);exit(EXIT_FAILURE);}// map the view of the file*data = MapViewOfFile(hMapFile, FILE_MAP_READ, 0, 0, 0);if (*data == NULL) {fprintf(stderr, "MapViewOfFile failed!\n");CloseHandle(hMapFile);CloseHandle(*fileHandle);exit(EXIT_FAILURE);}float* weights_ptr = *data + sizeof(Config)/sizeof(float);memory_map_weights(weights, config, weights_ptr, shared_weights);
}void build_transformer(Transformer *t, char* checkpoint_path) {// read in the Config and the Weights from the checkpointread_checkpoint(checkpoint_path, &t->config, &t->weights, &t->fileHandle, &t->data, &t->file_size);// allocate the RunState buffersmalloc_run_state(&t->state, &t->config);
}void free_transformer(Transformer* t) {// close the memory mappingif (t->data != NULL) { UnmapViewOfFile(t->data); }if (t->fileHandle != INVALID_HANDLE_VALUE) { CloseHandle(t->fileHandle); }// free the RunState buffersfree_run_state(&t->state);
}// ----------------------------------------------------------------------------
// neural net blocks; the dynamics of the Transformervoid rmsnorm(float* o, float* x, float* weight, int size) {// calculate sum of squaresfloat ss = 0.0f;for (int j = 0; j < size; j++) {ss += x[j] * x[j];}ss /= size;ss += 1e-5f;ss = 1.0f / sqrtf(ss);// normalize and scalefor (int j = 0; j < size; j++) {o[j] = weight[j] * (ss * x[j]);}
}void softmax(float* x, int size) {// find max value (for numerical stability)float max_val = x[0];for (int i = 1; i < size; i++) {if (x[i] > max_val) {max_val = x[i];}}// exp and sumfloat sum = 0.0f;for (int i = 0; i < size; i++) {x[i] = expf(x[i] - max_val);sum += x[i];}// normalizefor (int i = 0; i < size; i++) {x[i] /= sum;}
}void matmul(float* xout, float* x, float* w, int n, int d) {// W (d,n) @ x (n,) -> xout (d,)// by far the most amount of time is spent inside this little functionint i;#pragma omp parallel for private(i)for (i = 0; i < d; i++) {float val = 0.0f;for (int j = 0; j < n; j++) {val += w[i * n + j] * x[j];}xout[i] = val;}
}float* forward(Transformer* transformer, int token, int pos) {// a few convenience variablesConfig* p = &transformer->config;TransformerWeights* w = &transformer->weights;RunState* s = &transformer->state;float *x = s->x;int dim = p->dim;int kv_dim = (p->dim * p->n_kv_heads) / p->n_heads;int kv_mul = p->n_heads / p->n_kv_heads; // integer multiplier of the kv sharing in multiqueryint hidden_dim =  p->hidden_dim;int head_size = dim / p->n_heads;// copy the token embedding into xfloat* content_row = w->token_embedding_table + token * dim;memcpy(x, content_row, dim*sizeof(*x));// forward all the layersfor(unsigned long long l = 0; l < p->n_layers; l++) {// attention rmsnormrmsnorm(s->xb, x, w->rms_att_weight + l*dim, dim);// key and value point to the kv cacheint loff = l * p->seq_len * kv_dim; // kv cache layer offset for conveniences->k = s->key_cache + loff + pos * kv_dim;s->v = s->value_cache + loff + pos * kv_dim;// qkv matmuls for this positionmatmul(s->q, s->xb, w->wq + l*dim*dim, dim, dim);matmul(s->k, s->xb, w->wk + l*dim*kv_dim, dim, kv_dim);matmul(s->v, s->xb, w->wv + l*dim*kv_dim, dim, kv_dim);// RoPE relative positional encoding: complex-valued rotate q and k in each headfor (int i = 0; i < dim; i+=2) {int head_dim = i % head_size;float freq = 1.0f / powf(10000.0f, head_dim / (float)head_size);float val = pos * freq;float fcr = cosf(val);float fci = sinf(val);int rotn = i < kv_dim ? 2 : 1; // how many vectors? 2 = q & k, 1 = q onlyfor (int v = 0; v < rotn; v++) {float* vec = v == 0 ? s->q : s->k; // the vector to rotate (query or key)float v0 = vec[i];float v1 = vec[i+1];vec[i]   = v0 * fcr - v1 * fci;vec[i+1] = v0 * fci + v1 * fcr;}}// multihead attention. iterate over all headsint h;#pragma omp parallel for private(h)for (h = 0; h < p->n_heads; h++) {// get the query vector for this headfloat* q = s->q + h * head_size;// attention scores for this headfloat* att = s->att + h * p->seq_len;// iterate over all timesteps, including the current onefor (int t = 0; t <= pos; t++) {// get the key vector for this head and at this timestepfloat* k = s->key_cache + loff + t * kv_dim + (h / kv_mul) * head_size;// calculate the attention score as the dot product of q and kfloat score = 0.0f;for (int i = 0; i < head_size; i++) {score += q[i] * k[i];}score /= sqrtf(head_size);// save the score to the attention bufferatt[t] = score;}// softmax the scores to get attention weights, from 0..pos inclusivelysoftmax(att, pos + 1);// weighted sum of the values, store back into xbfloat* xb = s->xb + h * head_size;memset(xb, 0, head_size * sizeof(float));for (int t = 0; t <= pos; t++) {// get the value vector for this head and at this timestepfloat* v = s->value_cache + loff + t * kv_dim + (h / kv_mul) * head_size;// get the attention weight for this timestepfloat a = att[t];// accumulate the weighted value into xbfor (int i = 0; i < head_size; i++) {xb[i] += a * v[i];}}}// final matmul to get the output of the attentionmatmul(s->xb2, s->xb, w->wo + l*dim*dim, dim, dim);// residual connection back into xfor (int i = 0; i < dim; i++) {x[i] += s->xb2[i];}// ffn rmsnormrmsnorm(s->xb, x, w->rms_ffn_weight + l*dim, dim);// Now for FFN in PyTorch we have: self.w2(F.silu(self.w1(x)) * self.w3(x))// first calculate self.w1(x) and self.w3(x)matmul(s->hb, s->xb, w->w1 + l*dim*hidden_dim, dim, hidden_dim);matmul(s->hb2, s->xb, w->w3 + l*dim*hidden_dim, dim, hidden_dim);// SwiGLU non-linearityfor (int i = 0; i < hidden_dim; i++) {float val = s->hb[i];// silu(x)=x*σ(x), where σ(x) is the logistic sigmoidval *= (1.0f / (1.0f + expf(-val)));// elementwise multiply with w3(x)val *= s->hb2[i];s->hb[i] = val;}// final matmul to get the output of the ffnmatmul(s->xb, s->hb, w->w2 + l*dim*hidden_dim, hidden_dim, dim);// residual connectionfor (int i = 0; i < dim; i++) {x[i] += s->xb[i];}}// final rmsnormrmsnorm(x, x, w->rms_final_weight, dim);// classifier into logitsmatmul(s->logits, x, w->wcls, p->dim, p->vocab_size);return s->logits;
}// ----------------------------------------------------------------------------
// The Byte Pair Encoding (BPE) Tokenizer that translates strings <-> tokenstypedef struct {char *str;int id;
} TokenIndex;typedef struct {char** vocab;float* vocab_scores;TokenIndex *sorted_vocab;int vocab_size;unsigned int max_token_length;unsigned char byte_pieces[512]; // stores all single-byte strings
} Tokenizer;int compare_tokens(const void *a, const void *b) {return strcmp(((TokenIndex*)a)->str, ((TokenIndex*)b)->str);
}void build_tokenizer(Tokenizer* t, char* tokenizer_path, int vocab_size) {// i should have written the vocab_size into the tokenizer file... sight->vocab_size = vocab_size;// malloc space to hold the scores and the stringst->vocab = (char**)malloc(vocab_size * sizeof(char*));t->vocab_scores = (float*)malloc(vocab_size * sizeof(float));t->sorted_vocab = NULL; // initialized lazilyfor (int i = 0; i < 256; i++) {t->byte_pieces[i * 2] = (unsigned char)i;t->byte_pieces[i * 2 + 1] = '\0';}// read in the fileFILE *file = fopen(tokenizer_path, "rb");if (!file) { fprintf(stderr, "couldn't load %s\n", tokenizer_path); exit(EXIT_FAILURE); }if (fread(&t->max_token_length, sizeof(int), 1, file) != 1) { fprintf(stderr, "failed read\n"); exit(EXIT_FAILURE); }int len;for (int i = 0; i < vocab_size; i++) {if (fread(t->vocab_scores + i, sizeof(float), 1, file) != 1) { fprintf(stderr, "failed read\n"); exit(EXIT_FAILURE);}if (fread(&len, sizeof(int), 1, file) != 1) { fprintf(stderr, "failed read\n"); exit(EXIT_FAILURE); }t->vocab[i] = (char *)malloc(len + 1);if (fread(t->vocab[i], len, 1, file) != 1) { fprintf(stderr, "failed read\n"); exit(EXIT_FAILURE); }t->vocab[i][len] = '\0'; // add the string terminating token}fclose(file);
}void free_tokenizer(Tokenizer* t) {for (int i = 0; i < t->vocab_size; i++) { free(t->vocab[i]); }free(t->vocab);free(t->vocab_scores);free(t->sorted_vocab);
}char* decode(Tokenizer* t, int prev_token, int token) {char *piece = t->vocab[token];// following BOS (1) token, sentencepiece decoder strips any leading whitespace (see PR #89)if (prev_token == 1 && piece[0] == ' ') { piece++; }// careful, some tokens designate raw bytes, and look like e.g. '<0x01>'// parse this and convert and return the actual byteunsigned char byte_val;if (sscanf(piece, "<0x%02hhX>", &byte_val) == 1) {piece = (char*)t->byte_pieces + byte_val * 2;}return piece;
}void safe_printf(char *piece) {// piece might be a raw byte token, and we only want to print printable chars or whitespace// because some of the other bytes can be various control codes, backspace, etc.if (piece == NULL) { return; }if (piece[0] == '\0') { return; }if (piece[1] == '\0') {unsigned char byte_val = piece[0];if (!(isprint(byte_val) || isspace(byte_val))) {return; // bad byte, don't print it}}printf("%s", piece);
}int str_lookup(char *str, TokenIndex *sorted_vocab, int vocab_size) {// efficiently find the perfect match for str in vocab, return its index or -1 if not foundTokenIndex tok = { .str = str }; // acts as the key to search forTokenIndex *res = bsearch(&tok, sorted_vocab, vocab_size, sizeof(TokenIndex), compare_tokens);return res != NULL ? res->id : -1;
}void encode(Tokenizer* t, char *text, int8_t bos, int8_t eos, int *tokens, int *n_tokens) {// encode the string text (input) into an upper-bound preallocated tokens[] array// bos != 0 means prepend the BOS token (=1), eos != 0 means append the EOS token (=2)if (text == NULL) { fprintf(stderr, "cannot encode NULL text\n"); exit(EXIT_FAILURE); }if (t->sorted_vocab == NULL) {// lazily malloc and sort the vocabularyt->sorted_vocab = malloc(t->vocab_size * sizeof(TokenIndex));for (int i = 0; i < t->vocab_size; i++) {t->sorted_vocab[i].str = t->vocab[i];t->sorted_vocab[i].id = i;}qsort(t->sorted_vocab, t->vocab_size, sizeof(TokenIndex), compare_tokens);}// create a temporary buffer that will store merge candidates of always two consecutive tokens// *2 for concat, +1 for null terminator +2 for UTF8 (in case max_token_length is 1)char* str_buffer = malloc((t->max_token_length*2 +1 +2) * sizeof(char));size_t str_len = 0;// start at 0 tokens*n_tokens = 0;// add optional BOS (=1) token, if desiredif (bos) tokens[(*n_tokens)++] = 1;// add_dummy_prefix is true by default// so prepend a dummy prefix token to the input string, but only if text != ""// TODO: pretty sure this isn't correct in the general case but I don't have the// energy to read more of the sentencepiece code to figure out what it's doingif (text[0] != '\0') {int dummy_prefix = str_lookup(" ", t->sorted_vocab, t->vocab_size);tokens[(*n_tokens)++] = dummy_prefix;}// Okay UTF-8 time. This will get messy. Here is the reference from Wikipedia:// Code point ↔ UTF-8 conversion// First code point	Last code point	Byte 1	Byte 2	Byte 3	Byte 4// U+0000	U+007F	    0xxxxxxx// U+0080	U+07FF	    110xxxxx	10xxxxxx// U+0800	U+FFFF	    1110xxxx	10xxxxxx	10xxxxxx// U+10000	U+10FFFF    11110xxx	10xxxxxx	10xxxxxx	10xxxxxx// process the raw (UTF-8) byte sequence of the input stringfor (char *c = text; *c != '\0'; c++) {// reset buffer if the current byte is ASCII or a leading byte// 0xC0 is 11000000, so (*c & 0xC0) keeps the first 2 bits and zeros the rest// 0x80 is 10000000// in UTF-8, all continuation bytes start with "10" in first two bits// so in English this is: "if this byte is not a continuation byte"if ((*c & 0xC0) != 0x80) {// this byte must be either a leading byte (11...) or an ASCII char (0x...)// => reset our location, as we're starting a new UTF-8 codepointstr_len = 0;}// append the current byte to the bufferstr_buffer[str_len++] = *c; // ++ is post-increment, incremented after this linestr_buffer[str_len] = '\0';// while the next character is a continuation byte, continue appending// but if there are too many of them, just stop to avoid overruning str_buffer size.if ((*(c+1) & 0xC0) == 0x80 && str_len < 4) {continue;}// ok c+1 is not a continuation byte, so we've read in a full codepointint id = str_lookup(str_buffer, t->sorted_vocab, t->vocab_size);if (id != -1) {// we found this codepoint in vocab, add it as a tokentokens[(*n_tokens)++] = id;} else {// byte_fallback encoding: just encode each byte as a token// +3 is here because the first 3 vocab elements are <unk>, <s>, </s>// so the individual bytes only start at index 3for (int i=0; i < str_len; i++) {tokens[(*n_tokens)++] = (unsigned char)str_buffer[i] + 3;}}str_len = 0; // protect against a sequence of stray UTF8 continuation bytes}// merge the best consecutive pair each iteration, according the scores in vocab_scoreswhile (1) {float best_score = -1e10;int best_id = -1;int best_idx = -1;for (int i=0; i < (*n_tokens-1); i++) {// check if we can merge the pair (tokens[i], tokens[i+1])sprintf(str_buffer, "%s%s", t->vocab[tokens[i]], t->vocab[tokens[i+1]]);int id = str_lookup(str_buffer, t->sorted_vocab, t->vocab_size);if (id != -1 && t->vocab_scores[id] > best_score) {// this merge pair exists in vocab! record its score and positionbest_score = t->vocab_scores[id];best_id = id;best_idx = i;}}if (best_idx == -1) {break; // we couldn't find any more pairs to merge, so we're done}// merge the consecutive pair (best_idx, best_idx+1) into new token best_idtokens[best_idx] = best_id;// delete token at position best_idx+1, shift the entire sequence back 1for (int i = best_idx+1; i < (*n_tokens-1); i++) {tokens[i] = tokens[i+1];}(*n_tokens)--; // token length decreased}// add optional EOS (=2) token, if desiredif (eos) tokens[(*n_tokens)++] = 2;free(str_buffer);
}// ----------------------------------------------------------------------------
// The Sampler, which takes logits and returns a sampled token
// sampling can be done in a few ways: greedy argmax, sampling, top-p samplingtypedef struct {float prob;int index;
} ProbIndex; // struct used when sorting probabilities during top-p samplingtypedef struct {int vocab_size;ProbIndex* probindex; // buffer used in top-p samplingfloat temperature;float topp;unsigned long long rng_state;
} Sampler;int sample_argmax(float* probabilities, int n) {// return the index that has the highest probabilityint max_i = 0;float max_p = probabilities[0];for (int i = 1; i < n; i++) {if (probabilities[i] > max_p) {max_i = i;max_p = probabilities[i];}}return max_i;
}int sample_mult(float* probabilities, int n, float coin) {// sample index from probabilities (they must sum to 1!)// coin is a random number in [0, 1), usually from random_f32()float cdf = 0.0f;for (int i = 0; i < n; i++) {cdf += probabilities[i];if (coin < cdf) {return i;}}return n - 1; // in case of rounding errors
}int compare(const void* a, const void* b) {ProbIndex* a_ = (ProbIndex*) a;ProbIndex* b_ = (ProbIndex*) b;if (a_->prob > b_->prob) return -1;if (a_->prob < b_->prob) return 1;return 0;
}int sample_topp(float* probabilities, int n, float topp, ProbIndex* probindex, float coin) {// top-p sampling (or "nucleus sampling") samples from the smallest set of// tokens that exceed probability topp. This way we never sample tokens that// have very low probabilities and are less likely to go "off the rails".// coin is a random number in [0, 1), usually from random_f32()int n0 = 0;// quicksort indices in descending order of probabilities// values smaller than (1 - topp) / (n - 1) cannot be part of the result// so for efficiency we crop these out as candidates before sortingconst float cutoff = (1.0f - topp) / (n - 1);for (int i = 0; i < n; i++) {if (probabilities[i] >= cutoff) {probindex[n0].index = i;probindex[n0].prob = probabilities[i];n0++;}}qsort(probindex, n0, sizeof(ProbIndex), compare);// truncate the list where cumulative probability exceeds toppfloat cumulative_prob = 0.0f;int last_idx = n0 - 1; // in case of rounding errors consider all elementsfor (int i = 0; i < n0; i++) {cumulative_prob += probindex[i].prob;if (cumulative_prob > topp) {last_idx = i;break; // we've exceeded topp by including last_idx}}// sample from the truncated listfloat r = coin * cumulative_prob;float cdf = 0.0f;for (int i = 0; i <= last_idx; i++) {cdf += probindex[i].prob;if (r < cdf) {return probindex[i].index;}}return probindex[last_idx].index; // in case of rounding errors
}void build_sampler(Sampler* sampler, int vocab_size, float temperature, float topp, unsigned long long rng_seed) {sampler->vocab_size = vocab_size;sampler->temperature = temperature;sampler->topp = topp;sampler->rng_state = rng_seed;// buffer only used with nucleus sampling; may not need but it's ~smallsampler->probindex = malloc(sampler->vocab_size * sizeof(ProbIndex));
}void free_sampler(Sampler* sampler) {free(sampler->probindex);
}unsigned int random_u32(unsigned long long *state) {// xorshift rng: https://en.wikipedia.org/wiki/Xorshift#xorshift.2A*state ^= *state >> 12;*state ^= *state << 25;*state ^= *state >> 27;return (*state * 0x2545F4914F6CDD1Dull) >> 32;
}
float random_f32(unsigned long long *state) { // random float32 in [0,1)return (random_u32(state) >> 8) / 16777216.0f;
}int sample(Sampler* sampler, float* logits) {// sample the token given the logits and some hyperparametersint next;if (sampler->temperature == 0.0f) {// greedy argmax sampling: take the token with the highest probabilitynext = sample_argmax(logits, sampler->vocab_size);} else {// apply the temperature to the logitsfor (int q=0; q<sampler->vocab_size; q++) { logits[q] /= sampler->temperature; }// apply softmax to the logits to get the probabilities for next tokensoftmax(logits, sampler->vocab_size);// flip a (float) coin (this is our source of entropy for sampling)float coin = random_f32(&sampler->rng_state);// we sample from this distribution to get the next tokenif (sampler->topp <= 0 || sampler->topp >= 1) {// simply sample from the predicted probability distributionnext = sample_mult(logits, sampler->vocab_size, coin);} else {// top-p (nucleus) sampling, clamping the least likely tokens to zeronext = sample_topp(logits, sampler->vocab_size, sampler->topp, sampler->probindex, coin);}}return next;
}// ----------------------------------------------------------------------------
// utilities: timelong time_in_ms() {// return time in milliseconds, for benchmarking the model speedstruct timespec time;clock_gettime(CLOCK_REALTIME, &time);return time.tv_sec * 1000 + time.tv_nsec / 1000000;
}// ----------------------------------------------------------------------------
// generation loopvoid generate(Transformer *transformer, Tokenizer *tokenizer, Sampler *sampler, char *prompt, int steps) {char *empty_prompt = "";if (prompt == NULL) { prompt = empty_prompt; }// encode the (string) prompt into tokens sequenceint num_prompt_tokens = 0;int* prompt_tokens = (int*)malloc((strlen(prompt)+3) * sizeof(int)); // +3 for '\0', ?BOS, ?EOSencode(tokenizer, prompt, 1, 0, prompt_tokens, &num_prompt_tokens);if (num_prompt_tokens < 1) {fprintf(stderr, "something is wrong, expected at least 1 prompt token\n");exit(EXIT_FAILURE);}// start the main looplong start = 0;  // used to time our code, only initialized after first iterationint next;        // will store the next token in the sequenceint token = prompt_tokens[0]; // kick off with the first token in the promptint pos = 0;     // position in the sequencewhile (pos < steps) {// forward the transformer to get logits for the next tokenfloat* logits = forward(transformer, token, pos);// advance the state machineif (pos < num_prompt_tokens - 1) {// if we are still processing the input prompt, force the next prompt tokennext = prompt_tokens[pos + 1];} else {// otherwise sample the next token from the logitsnext = sample(sampler, logits);}pos++;// data-dependent terminating condition: the BOS (=1) token delimits sequencesif (next == 1) { break; }// print the token as string, decode it with the Tokenizer objectchar* piece = decode(tokenizer, token, next);safe_printf(piece); // same as printf("%s", piece), but skips "unsafe" bytesfflush(stdout);token = next;// init the timer here because the first iteration can be slowerif (start == 0) { start = time_in_ms(); }}printf("\n");// report achieved tok/s (pos-1 because the timer starts after first iteration)if (pos > 1) {long end = time_in_ms();fprintf(stderr, "achieved tok/s: %f\n", (pos-1) / (double)(end-start)*1000);}free(prompt_tokens);
}void read_stdin(const char* guide, char* buffer, size_t bufsize) {// read a line from stdin, up to but not including \nprintf("%s", guide);if (fgets(buffer, bufsize, stdin) != NULL) {size_t len = strlen(buffer);if (len > 0 && buffer[len - 1] == '\n') {buffer[len - 1] = '\0'; // strip newline}}
}// ----------------------------------------------------------------------------
// chat loop
// I manually inspected the tokens for a few chat conversations compared to
// python reference and that seemed ok, but this was not thoroughly tested and
// is not safely implemented, it's more a proof of concept atm.void chat(Transformer *transformer, Tokenizer *tokenizer, Sampler *sampler,char *cli_user_prompt, char *cli_system_prompt, int steps) {// buffers for reading the system prompt and user prompt from stdin// you'll notice they are soomewhat haphazardly and unsafely set atmchar system_prompt[512];char user_prompt[512];char rendered_prompt[1152];int num_prompt_tokens = 0;int* prompt_tokens = (int*)malloc(1152 * sizeof(int));int user_idx;// start the main loopint8_t user_turn = 1; // user startsint next;        // will store the next token in the sequenceint token;       // stores the current token to feed into the transformerint prev_token;int pos = 0;     // position in the sequencewhile (pos < steps) {// when it is the user's turn to contribute tokens to the dialog...if (user_turn) {// get the (optional) system prompt at position 0if (pos == 0) {// at position 0, the user can also contribute a system promptif (cli_system_prompt == NULL) {// system prompt was not passed in, attempt to get it from stdinread_stdin("Enter system prompt (optional): ", system_prompt, sizeof(system_prompt));} else {// system prompt was passed in, use itstrcpy(system_prompt, cli_system_prompt);}}// get the user promptif (pos == 0 && cli_user_prompt != NULL) {// user prompt for position 0 was passed in, use itstrcpy(user_prompt, cli_user_prompt);} else {// otherwise get user prompt from stdinread_stdin("User: ", user_prompt, sizeof(user_prompt));}// render user/system prompts into the Llama 2 Chat schemaif (pos == 0 && system_prompt[0] != '\0') {char system_template[] = "[INST] <<SYS>>\n%s\n<</SYS>>\n\n%s [/INST]";sprintf(rendered_prompt, system_template, system_prompt, user_prompt);} else {char user_template[] = "[INST] %s [/INST]";sprintf(rendered_prompt, user_template, user_prompt);}// encode the rendered prompt into tokensencode(tokenizer, rendered_prompt, 1, 0, prompt_tokens, &num_prompt_tokens);user_idx = 0; // reset the user indexuser_turn = 0;printf("Assistant: ");}// determine the token to pass into the transformer nextif (user_idx < num_prompt_tokens) {// if we are still processing the input prompt, force the next prompt tokentoken = prompt_tokens[user_idx++];} else {// otherwise use the next token sampled from previous turntoken = next;}// EOS (=2) token ends the Assistant turnif (token == 2) { user_turn = 1; }// forward the transformer to get logits for the next tokenfloat* logits = forward(transformer, token, pos);next = sample(sampler, logits);pos++;if (user_idx >= num_prompt_tokens && next != 2) {// the Assistant is responding, so print its outputchar* piece = decode(tokenizer, token, next);safe_printf(piece); // same as printf("%s", piece), but skips "unsafe" bytesfflush(stdout);}if (next == 2) { printf("\n"); }}printf("\n");free(prompt_tokens);
}// ----------------------------------------------------------------------------
// CLI, include only if not testing
#ifndef TESTINGvoid error_usage() {fprintf(stderr, "Usage:   run <checkpoint> [options]\n");fprintf(stderr, "Example: run model.bin -n 256 -i \"Once upon a time\"\n");fprintf(stderr, "Options:\n");fprintf(stderr, "  -t <float>  temperature in [0,inf], default 1.0\n");fprintf(stderr, "  -p <float>  p value in top-p (nucleus) sampling in [0,1] default 0.9\n");fprintf(stderr, "  -s <int>    random seed, default time(NULL)\n");fprintf(stderr, "  -n <int>    number of steps to run for, default 256. 0 = max_seq_len\n");fprintf(stderr, "  -i <string> input prompt\n");fprintf(stderr, "  -z <string> optional path to custom tokenizer\n");fprintf(stderr, "  -m <string> mode: generate|chat, default: generate\n");fprintf(stderr, "  -y <string> (optional) system prompt in chat mode\n");exit(EXIT_FAILURE);
}int main(int argc, char *argv[]) {// default parameterschar *checkpoint_path = NULL;  // e.g. out/model.binchar *tokenizer_path = "tokenizer.bin";float temperature = 1.0f;   // 0.0 = greedy deterministic. 1.0 = original. don't set higherfloat topp = 0.9f;          // top-p in nucleus sampling. 1.0 = off. 0.9 works well, but slowerint steps = 256;            // number of steps to run forchar *prompt = NULL;        // prompt stringunsigned long long rng_seed = 0; // seed rng with time by defaultchar *mode = "generate";    // generate|chatchar *system_prompt = NULL; // the (optional) system prompt to use in chat mode// poor man's C argparse so we can override the defaults above from the command lineif (argc >= 2) { checkpoint_path = argv[1]; } else { error_usage(); }for (int i = 2; i < argc; i+=2) {// do some basic validationif (i + 1 >= argc) { error_usage(); } // must have arg after flagif (argv[i][0] != '-') { error_usage(); } // must start with dashif (strlen(argv[i]) != 2) { error_usage(); } // must be -x (one dash, one letter)// read in the argsif (argv[i][1] == 't') { temperature = atof(argv[i + 1]); }else if (argv[i][1] == 'p') { topp = atof(argv[i + 1]); }else if (argv[i][1] == 's') { rng_seed = atoi(argv[i + 1]); }else if (argv[i][1] == 'n') { steps = atoi(argv[i + 1]); }else if (argv[i][1] == 'i') { prompt = argv[i + 1]; }else if (argv[i][1] == 'z') { tokenizer_path = argv[i + 1]; }else if (argv[i][1] == 'm') { mode = argv[i + 1]; }else if (argv[i][1] == 'y') { system_prompt = argv[i + 1]; }else { error_usage(); }}// parameter validation/overridesif (rng_seed <= 0) rng_seed = (unsigned int)time(NULL);if (temperature < 0.0) temperature = 0.0;if (topp < 0.0 || 1.0 < topp) topp = 0.9;if (steps < 0) steps = 0;// build the Transformer via the model .bin fileTransformer transformer;build_transformer(&transformer, checkpoint_path);if (steps == 0 || steps > transformer.config.seq_len) steps = transformer.config.seq_len; // override to ~max length// build the Tokenizer via the tokenizer .bin fileTokenizer tokenizer;build_tokenizer(&tokenizer, tokenizer_path, transformer.config.vocab_size);// build the SamplerSampler sampler;build_sampler(&sampler, transformer.config.vocab_size, temperature, topp, rng_seed);// run!if (strcmp(mode, "generate") == 0) {generate(&transformer, &tokenizer, &sampler, prompt, steps);} else if (strcmp(mode, "chat") == 0) {chat(&transformer, &tokenizer, &sampler, prompt, system_prompt, steps);} else {fprintf(stderr, "unknown mode: %s\n", mode);error_usage();}// memory and file handles cleanupfree_sampler(&sampler);free_tokenizer(&tokenizer);free_transformer(&transformer);return 0;
}
#endif

5.1.3 生成可执行文件

gcc -o run run.c

运行后会在根目录下生成run.exe文件。

5.2 模型推理

./run out/model.bin

输出如下:

Once upon a time, there was a little girl named Lily. She loved to play in the garden and watch the insects crawl around. One day, while she was playing, she saw a butterfly flying around. “Hello butterfly, what are you doing?” Lily said, feeling grateful. “I am waiting for my mom to come back to check on her,” the butterfly replied. Lily watched as the butterfly flew away, feeling proud that she had made it happen. From that day on, she loved looking at insects and watching them grow and become happy.
achieved tok/s: 24.576110

还可以指定参数运行:

./run .\out\model.bin -i "One day, Lily met a Shoggoth"

也可以从Huggingface网站下载其他模型试试:

wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
./run stories15M.binwget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories42M.bin
./run stories42M.bin

备注:清华大学开源软件镜像站

6 结束语

大功告成!单卡RTX4070刚好能跑起来。
Enjoy it!

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.xdnf.cn/news/147469.html

如若内容造成侵权/违法违规/事实不符,请联系一条长河网进行投诉反馈,一经查实,立即删除!

相关文章

Java算法专栏

专栏导读 在当今这个技术日新月异的时代&#xff0c;Java算法作为软件开发的核心&#xff0c;对于提升程序性能和解决复杂问题至关重要。本“Java算法”专栏旨在帮助读者深入理解Java编程语言中的算法原理和应用&#xff0c;通过实战案例和深入分析&#xff0c;使读者能够掌握…

软媒市场新探索:软文媒体自助发布,开启自助发稿新篇章

在繁华喧嚣的软媒市场中,每一个声音都在竭力呼喊,每一个品牌都在奋力展现。而软文,作为一种温柔而坚韧的营销力量,正逐渐崭露头角。特别是软文媒体自助发布平台的出现,更是为企业提供了一个全新的、高效的自助发稿渠道。 软媒市场自助发布平台,正如其名,是一个让企业能够自主发…

【LeetCode】每日一题 2024_9_21 边积分最高的节点(哈希)

前言 每天和你一起刷 LeetCode 每日一题~ LeetCode 启动&#xff01; 题目&#xff1a;边积分最高的节点 代码与解题思路 func edgeScore(edges []int) (ans int) {// 直接维护哈希最大值即可mp : map[int]int{}for i, v : range edges {mp[v] i// 如果多个节点的 边积分 相…

【数据库】常用数据库简介

目录 &#x1f354; 常用的关系型数据库 &#x1f354; Mysql简介 &#x1f354; SQL 简介 SQL语句的分类 SQL 写法 SQL 常用的数据类型 &#x1f354; DDL语句 对数据库的操作 对数据表的操作 &#x1f354; DML语句 插入数据 insert into 修改数据 update 删除数…

Ubuntu下使用 python搭建服务实现从web端远程配置设备网口

1、通过文件配置Ubuntu设备网口 在Ubuntu工控机上&#xff0c;通过文件配置网口&#xff08;网络接口&#xff09;可以让网络配置在每次系统启动时自动生效。以下是常见的方法步骤&#xff1a; 1.1 使用 netplan 配置网口&#xff08;Ubuntu 18.04 及以上版本&#xff09; 编…

探索微软Copilot Agents:如何通过Wave 2 AI彻底改变工作方式

微软在最近的Copilot Wave 2发布会上&#xff0c;展示了一系列将彻底改变日常工作流程的新AI功能&#xff0c;尤其是 Copilot Agents&#xff0c;它们不仅仅是简单的工具&#xff0c;而是真正的工作助理&#xff0c;可以自动完成任务、提供智能分析并帮助你做出决策。这些新功能…

Day6:反转链表

题目&#xff1a;给你单链表的头节点head&#xff0c;请你反转链表&#xff0c;并返回反转后的链表。 输入&#xff1a;head[1,2,3,4,5] 输出&#xff1a;[5,4,3,2,1] public ListNode reverseList() {if (head null) {return head;}ListNode cur head.next;head.next null…

Python脚本每日自动备份MySQL数据库,无需mysqldump

编写一个Python脚本&#xff0c;每天凌晨3点开始备份 脚本具有以下特点 不需要安装mysql-client&#xff0c;并且Windows Linux都可以使用支持多个数据库连接的备份每个数据库支持多个表备份日志保存下来&#xff0c;方便第二天早上查看备份结果 首先安装需要的库 pip3 ins…

调节 PWM的占空比控制舵机的角度

一、PWM工作原理 让计数器从0数到自动重装载值&#xff0c;不停计数。计数值小于输出比较寄存器时输出一种电平&#xff0c;大于输出比较寄存器时使出另一种电平。 修改定时器时钟源的速度以及预分频器等设置&#xff0c;可以修改计数器计数的速度 再加上修改自动重装载值&…

肺结节检测系统源码分享

肺结节检测检测系统源码分享 [一条龙教学YOLOV8标注好的数据集一键训练_70全套改进创新点发刊_Web前端展示] 1.研究背景与意义 项目参考AAAI Association for the Advancement of Artificial Intelligence 项目来源AACV Association for the Advancement of Computer Visio…

python画图|图像背景颜色设置

python画图出来的默认图形背景是白色&#xff0c;有时候并不适合大家表达想要表达的意思。 因此&#xff0c;我们很有必要掌握自己设置图形背景颜色的技巧。 【1】官网教程 首先请各位看官移步官网&#xff0c;看看官网如何设置&#xff0c;下述链接可轻松到达&#xff1a; …

如何将很多个pdf拼接在一起?很多种PDF拼接的方法

如何将很多个pdf拼接在一起&#xff1f;将多个PDF文件合并不仅能够提升信息的整合性&#xff0c;还能使文件管理更加高效。想象一下&#xff0c;你需要向同事或老师提交一份综合报告&#xff0c;其中包含了多份相关资料。如果每个文件单独存在&#xff0c;查找和传输都会变得繁…

IDEA中Quarkus框架(3.13版本)开发、调试、部署、打包等

code-with-quarkus code-with-quarkus 是使用官网生成的demo项目 这个项目使用Quarkus&#xff08;使用3.13.0版本&#xff0c;该版本支持JDK21&#xff09;&#xff0c;超音速亚原子Java框架。官网地址: https://quarkus.io/. 环境要求 OS: Windows 10.0 jdk 11 maven 3.9…

2024年研赛-华为杯数模竞赛F题论文首发+论文讲解

本届研赛助攻题目 C D F三题论文均已经全部完成。后更新计划 如图所示。 免费给大家分享 三个问题的论文部分代码 2024年华为杯-研赛分享资料&#xff08;论文部分代码&#xff09;&#xff08;已更新部分代码&#xff09;&#xff1a; 链接&#xff1a;https://pan.baidu.com…

【Pyside】pycharm2024配置conda虚拟环境

知识拓展 Pycharm 是一个由 JetBrains 开发的集成开发环境&#xff08;IDE&#xff09;&#xff0c;它主要用于 Python 编程语言的开发。Pycharm 提供了代码编辑、调试、版本控制、测试等多种功能&#xff0c;以提高 Python 开发者的效率。 Pycharm 与 Python 的关系 Pycharm 是…

springboot实战学习笔记(5)(用户登录接口的主逻辑)

接着上篇博客学习。上篇博客是已经基本完成用户模块的注册接口的开发以及注册时的参数合法性校验。具体往回看了解的链接如下。 springboot实训学习笔记&#xff08;4&#xff09;(Spring Validation参数校验框架、全局异常处理器)-CSDN博客文章浏览阅读576次&#xff0c;点赞7…

Java免税商品购物商城:Spring Boot实现详解

第一章 绪论 1.1 课题开发的背景 从古至今&#xff0c;通过书本获取知识信息的方式完全被互联网络信息化&#xff0c;但是免税商品优选购物商城&#xff0c;对于购物商城工作来说&#xff0c;仍然是一项非常重要的工作。尤其是免税商品优选购物商城&#xff0c;传统人工记录模式…

【Proteus仿真】基于51单片机的L298N电机电速调节

目录 一、主要功能 二、硬件资源 三、程序编程 四、实现现象 一、主要功能 基于51单片机&#xff0c;L298N电机驱动连接电机&#xff0c;采用调节PWM占空比来控制电机速度转动。 仿真图&#xff1a; 编辑 二、硬件资源 基于KEIL5编写C代码&#xff0c;PROTEUS8.15进行…

SQL - 进阶语法(一)

1. SELECT TOP SELECT TOP Number|Percentage column1 from ... 选择所需要的数据的前多少行&#xff0c;Number表示具体数值&#xff0c;Percentage表示百分比 2. LIKE 搜索列元素的指定模式&#xff0c;匹配对应的信息进行自定义搜索 通常需要搭配通配符进行使用 %替代…

Postman cURL命令导入导出

导入cURL命令 cURL是一种用于发出HTTP请求的流行命令行工具。在测试Web应用程序或API时&#xff0c;cURL使您能够直接从命令行进行交互&#xff0c;使用API开发人员社区中常见的完善语法。如果在不同的地方有多个cURL命令&#xff0c;可以将它们导入Postman。 ​ 将cURL命令导入…