Mistral AI open-sources yet another formerly closed enterprise model: Mistral-Small-Instruct-2409

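Shortly after releasing the Pixtral 12B vision multimodal model, Mistral has also released the weights of its enterprise-grade small model, Mistral-Small-Instruct-2409 (22B). It is Mistral AI's latest enterprise small model and an upgrade of Mistral Small v24.02. The model is available under the Mistral Research License and gives customers a cost-efficient, fast, and reliable option for tasks such as translation, summarization, sentiment analysis, and other work that does not need a full general-purpose model.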

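The original Mistral Small was based on Mixtral-8X7B-v0.1 (46.7B), a sparse mixture-of-experts model with about 12.9B active parameters per token. It offers stronger reasoning and more features, can generate and reason about code, and is multilingual, supporting English, French, German, Italian, and Spanish.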

This is exciting: Mistral models consistently perform well for their size. We now have strong coverage across most size brackets:

  • 8b- Llama 3.1 8b

  • 12b- Nemo 12b

  • 22b- Mistral Small

  • 27b- Gemma-2 27b

  • 35b- Command-R 35b 08-2024

  • 40-60b- GAP (I believe there are two new MoEs in this range, but last I checked llama.cpp did not support them)

  • 70b- Llama 3.1 70b

  • 103b- Command-R+ 103b

  • 123b- Mistral Large 2

  • 141b- WizardLM-2 8x22b

  • 230b- Deepseek V2/2.5

  • 405b- Llama 3.1 405b


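Mistral Small v24.09 has 22 billion parameters and gives customers a convenient midpoint between Mistral NeMo 12B and Mistral Large 2, offering a cost-effective solution that can be deployed across a variety of platforms and environments. Compared with previous models, the new small model shows significant improvements in human alignment, reasoning, and code.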

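Mistral-Small-Instruct-2409 is an instruction fine-tuned version with the following characteristics:

  • 22B parameters
  • Vocabulary size of 32768
  • Supports function calling
  • 128k sequence length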

Usage

vLLM (recommended)

Install vLLM >= v0.6.1.post1

pip install --upgrade vllm

Install mistral_common >= 1.4.1

pip install --upgrade mistral_common

Local

from vllm import LLM
from vllm.sampling_params import SamplingParams

model_name = "mistralai/Mistral-Small-Instruct-2409"

sampling_params = SamplingParams(max_tokens=8192)

# Note that running Mistral-Small on a single GPU requires at least 44 GB of GPU RAM.
# To divide the GPU requirement over multiple devices, add e.g. `tensor_parallel_size=2`.
llm = LLM(model=model_name, tokenizer_mode="mistral", config_format="mistral", load_format="mistral")

prompt = "How often does the letter r occur in Mistral?"

messages = [
    {
        "role": "user",
        "content": prompt
    },
]

outputs = llm.chat(messages, sampling_params=sampling_params)

print(outputs[0].outputs[0].text)

Server

vllm serve mistralai/Mistral-Small-Instruct-2409 --tokenizer_mode mistral --config_format mistral --load_format mistral

Note: running Mistral-Small on a single GPU requires at least 44 GB of GPU RAM.

To split the GPU requirement across multiple devices, add e.g. --tensor-parallel-size 2.
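The 44 GB figure is consistent with simple arithmetic: 22 billion parameters stored at 2 bytes each (bf16 or fp16). The sketch below reproduces that estimate and shows how it divides under tensor parallelism. It assumes 16-bit weights and an even split across GPUs, and it ignores the KV cache, activations, and framework overhead, so real usage will be higher. The function names are illustrative and not part of vLLM.

```python
def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def per_gpu_weight_memory_gb(n_params: float, tensor_parallel_size: int,
                             bytes_per_param: int = 2) -> float:
    """Weights held by each GPU, assuming an even split across devices."""
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be >= 1")
    return weight_memory_gb(n_params, bytes_per_param) / tensor_parallel_size


total = weight_memory_gb(22e9)               # 22B parameters, 16-bit weights (assumed)
per_gpu = per_gpu_weight_memory_gb(22e9, 2)  # split across 2 GPUs
print(f"weights: {total:.0f} GB total, {per_gpu:.0f} GB per GPU")
```

With two GPUs, each device holds roughly 22 GB of weights, which is why the tensor-parallel option makes the model fit on smaller cards.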

Client

curl --location 'http://<your-node-url>:8000/v1/chat/completions' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer token' \
--data '{
    "model": "mistralai/Mistral-Small-Instruct-2409",
    "messages": [
      {
        "role": "user",
        "content": "How often does the letter r occur in Mistral?"
      }
    ]
}'
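The same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the server exposes the OpenAI-compatible /v1/chat/completions route used by the curl command and that the reply follows the usual choices[0].message.content shape. The helper names build_chat_request and ask are illustrative, and the localhost URL in the usage comment is a placeholder for your own node.

```python
import json
import urllib.request


def build_chat_request(base_url: str, prompt: str,
                       model: str = "mistralai/Mistral-Small-Instruct-2409",
                       token: str = "token") -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {token}"},
        method="POST",
    )


def ask(base_url: str, prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(base_url, prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running `vllm serve` instance):
# print(ask("http://localhost:8000", "How often does the letter r occur in Mistral?"))
```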

Mistral-inference

Install mistral_inference >= 1.4.1

pip install mistral_inference --upgrade

Download

from huggingface_hub import snapshot_download
from pathlib import Path

mistral_models_path = Path.home().joinpath('mistral_models', '22B-Instruct-Small')
mistral_models_path.mkdir(parents=True, exist_ok=True)

snapshot_download(repo_id="mistralai/Mistral-Small-Instruct-2409", allow_patterns=["params.json", "consolidated.safetensors", "tokenizer.model.v3"], local_dir=mistral_models_path)

Chat

mistral-chat $HOME/mistral_models/22B-Instruct-Small --instruct --max_tokens 256

Instruct following

from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# mistral_models_path is the directory created in the download step
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
model = Transformer.from_folder(mistral_models_path)

completion_request = ChatCompletionRequest(messages=[UserMessage(content="How often does the letter r occur in Mistral?")])

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)

Function calling

from mistral_common.protocol.instruct.tool_calls import Function, Tool
from mistral_inference.transformer import Transformer
from mistral_inference.generate import generate

from mistral_common.tokens.tokenizers.mistral import MistralTokenizer
from mistral_common.protocol.instruct.messages import UserMessage
from mistral_common.protocol.instruct.request import ChatCompletionRequest

# mistral_models_path is the directory created in the download step
tokenizer = MistralTokenizer.from_file(f"{mistral_models_path}/tokenizer.model.v3")
model = Transformer.from_folder(mistral_models_path)

completion_request = ChatCompletionRequest(
    tools=[
        Tool(
            function=Function(
                name="get_current_weather",
                description="Get the current weather",
                parameters={
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The city and state, e.g. San Francisco, CA",
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"],
                            "description": "The temperature unit to use. Infer this from the users location.",
                        },
                    },
                    "required": ["location", "format"],
                },
            )
        )
    ],
    messages=[
        UserMessage(content="What's the weather like today in Paris?"),
    ],
)

tokens = tokenizer.encode_chat_completion(completion_request).tokens

out_tokens, _ = generate([tokens], model, max_tokens=64, temperature=0.0, eos_id=tokenizer.instruct_tokenizer.tokenizer.eos_id)
result = tokenizer.instruct_tokenizer.tokenizer.decode(out_tokens[0])

print(result)
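The snippet above only prints the model's tool call. Your own code still has to parse it and run the matching function. The following is a rough sketch of that step. It assumes the decoded text is a JSON list of objects with "name" and "arguments" keys, optionally preceded by a [TOOL_CALLS] marker. The sample string, the stub implementation of get_current_weather, and the REGISTRY dictionary are all hypothetical and only illustrate the flow. They are not real model output or part of mistral_common.

```python
import json

TOOL_CALLS_PREFIX = "[TOOL_CALLS]"


def parse_tool_calls(decoded: str) -> list[dict]:
    """Parse decoded model output into a list of {"name", "arguments"} dicts."""
    text = decoded.strip()
    if text.startswith(TOOL_CALLS_PREFIX):
        text = text[len(TOOL_CALLS_PREFIX):].strip()
    calls = json.loads(text)
    if not isinstance(calls, list):
        raise ValueError("expected a JSON list of tool calls")
    return calls


def get_current_weather(location: str, format: str) -> str:
    """Hypothetical stub; a real version would query a weather service."""
    return f"(stub) weather for {location} in {format}"


# Maps tool names declared in the request to local Python callables.
REGISTRY = {"get_current_weather": get_current_weather}


def dispatch(calls: list[dict]) -> list[str]:
    """Run each requested tool with the arguments the model supplied."""
    return [REGISTRY[call["name"]](**call["arguments"]) for call in calls]


# Hypothetical output string, written by hand for illustration.
sample = ('[{"name": "get_current_weather", '
          '"arguments": {"location": "Paris, France", "format": "celsius"}}]')
print(dispatch(parse_tool_calls(sample)))
```

Keeping an explicit registry means the model can only trigger functions you have deliberately exposed. An unknown name raises a KeyError instead of calling arbitrary code.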

Hugging Face Transformers

from transformers import LlamaTokenizerFast, MistralForCausalLM
import torch

device = "cuda"
tokenizer = LlamaTokenizerFast.from_pretrained('mistralai/Mistral-Small-Instruct-2409')
tokenizer.pad_token = tokenizer.eos_token

model = MistralForCausalLM.from_pretrained('mistralai/Mistral-Small-Instruct-2409', torch_dtype=torch.bfloat16)
model = model.to(device)

prompt = "How often does the letter r occur in Mistral?"

messages = [
    {"role": "user", "content": prompt},
]

model_input = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(device)
gen = model.generate(model_input, max_new_tokens=150)
dec = tokenizer.batch_decode(gen)
print(dec)

Output

<s>
  [INST]
  How often does the letter r occur in Mistral?
  [/INST]
  To determine how often the letter "r" occurs in the word "Mistral,"
  we can simply count the instances of "r" in the word.
  The word "Mistral" is broken down as follows:
    - M
    - i
    - s
    - t
    - r
    - a
    - l
  Counting the "r"s, we find that there is only one "r" in "Mistral."
  Therefore, the letter "r" occurs once in the word "Mistral."
</s>

It looks like Mistral is trying to fix the strawberry problem with CoT 🙂
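The model's answer is easy to check by hand. The short sketch below counts the letter directly for "Mistral" and for "strawberry", the word the well-known test is named after. The helper count_letter is an illustrative name.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


print(count_letter("Mistral", "r"))     # 1
print(count_letter("strawberry", "r"))  # 3
```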

Resources

https://mistral.ai/news/september-24-release/

https://artificialanalysis.ai/models/mistral-small

https://huggingface.co/mistralai/Mistral-Small-Instruct-2409
