英伟达基于Mistral 7B开发新一代Embedding模型——NV-Embed-v2

在这里插入图片描述

我们介绍的 NV-Embed-v2 是一种通用嵌入模型,它在大规模文本嵌入基准(MTEB 基准)(截至 2024 年 8 月 30 日)的 56 项文本嵌入任务中以 72.31 的高分排名第一。此外,它还在检索子类别中排名第一(在 15 项任务中获得 62.65 分),这对 RAG 技术的发展至关重要。

NV-Embed-v2 采用了多项新设计,包括让 LLM 关注潜在向量,以获得更好的池化嵌入输出,并展示了一种两阶段指令调整方法,以提高检索和非检索任务的准确性。此外,NV-Embed-v2 还采用了一种新颖的硬阴性挖掘方法,该方法考虑了正相关性得分,能更好地去除假阴性。

有关更多技术细节,请参阅我们的论文: NV-Embed:将 LLM 训练为通用嵌入模型的改进技术。

型号详情

  • 仅用于解码器的基本 LLM:Mistral-7B-v0.1
  • 池类型: Latent-Attention
  • 嵌入尺寸: 4096

如何使用

所需软件包

如果遇到问题,请尝试安装以下 python 软件包

pip uninstall -y transformer-engine
pip install torch==2.2.0
pip install transformers==4.42.4
pip install flash-attn==2.2.0
pip install sentence-transformers==2.7.0

以下是如何使用 Huggingface-transformer 和 Sentence-transformer 对查询和段落进行编码的示例。

HuggingFace Transformers

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel# Each query needs to be accompanied by an corresponding instruction describing the task.
task_name_to_instruct = {"example": "Given a question, retrieve passages that answer the question",}query_prefix = "Instruct: "+task_name_to_instruct["example"]+"\nQuery: "
queries = ['are judo throws allowed in wrestling?', 'how to become a radiology technician in michigan?']# No instruction needed for retrieval passages
passage_prefix = ""
passages = ["Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.","Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan."
]# load model with tokenizer
model = AutoModel.from_pretrained('nvidia/NV-Embed-v2', trust_remote_code=True)# get the embeddings
max_length = 32768
query_embeddings = model.encode(queries, instruction=query_prefix, max_length=max_length)
passage_embeddings = model.encode(passages, instruction=passage_prefix, max_length=max_length)# normalize embeddings
query_embeddings = F.normalize(query_embeddings, p=2, dim=1)
passage_embeddings = F.normalize(passage_embeddings, p=2, dim=1)# get the embeddings with DataLoader (spliting the datasets into multiple mini-batches)
# batch_size=2
# query_embeddings = model._do_encode(queries, batch_size=batch_size, instruction=query_prefix, max_length=max_length, num_workers=32, return_numpy=True)
# passage_embeddings = model._do_encode(passages, batch_size=batch_size, instruction=passage_prefix, max_length=max_length, num_workers=32, return_numpy=True)scores = (query_embeddings @ passage_embeddings.T) * 100
print(scores.tolist())
# [[87.42693328857422, 0.46283677220344543], [0.965264618396759, 86.03721618652344]]

Sentence-Transformers

import torch
from sentence_transformers import SentenceTransformer# Each query needs to be accompanied by an corresponding instruction describing the task.
task_name_to_instruct = {"example": "Given a question, retrieve passages that answer the question",}query_prefix = "Instruct: "+task_name_to_instruct["example"]+"\nQuery: "
queries = ['are judo throws allowed in wrestling?', 'how to become a radiology technician in michigan?']# No instruction needed for retrieval passages
passages = ["Since you're reading this, you are probably someone from a judo background or someone who is just wondering how judo techniques can be applied under wrestling rules. So without further ado, let's get to the question. Are Judo throws allowed in wrestling? Yes, judo throws are allowed in freestyle and folkstyle wrestling. You only need to be careful to follow the slam rules when executing judo throws. In wrestling, a slam is lifting and returning an opponent to the mat with unnecessary force.","Below are the basic steps to becoming a radiologic technologist in Michigan:Earn a high school diploma. As with most careers in health care, a high school education is the first step to finding entry-level employment. Taking classes in math and science, such as anatomy, biology, chemistry, physiology, and physics, can help prepare students for their college studies and future careers.Earn an associate degree. Entry-level radiologic positions typically require at least an Associate of Applied Science. Before enrolling in one of these degree programs, students should make sure it has been properly accredited by the Joint Review Committee on Education in Radiologic Technology (JRCERT).Get licensed or certified in the state of Michigan."
]# load model with tokenizer
model = SentenceTransformer('nvidia/NV-Embed-v2', trust_remote_code=True)
model.max_seq_length = 32768
model.tokenizer.padding_side="right"def add_eos(input_examples):input_examples = [input_example + model.tokenizer.eos_token for input_example in input_examples]return input_examples# get the embeddings
batch_size = 2
query_embeddings = model.encode(add_eos(queries), batch_size=batch_size, prompt=query_prefix, normalize_embeddings=True)
passage_embeddings = model.encode(add_eos(passages), batch_size=batch_size, normalize_embeddings=True)scores = (query_embeddings @ passage_embeddings.T) * 100
print(scores.tolist())

MTEB 基准的指令模板

对于检索、STS 和摘要的 MTEB 子任务,请使用 instructions.json 中的指令前缀模板。 对于分类、聚类和重排,请使用 NV-Embed 论文表 7 中提供的说明。 7 中提供的说明。

instructions.json

{"ClimateFEVER":{"query": "Given a claim about climate change, retrieve documents that support or refute the claim","corpus": ""},"HotpotQA":{"query": "Given a multi-hop question, retrieve documents that can help answer the question","corpus": ""},"FEVER":{"query": "Given a claim, retrieve documents that support or refute the claim","corpus": ""},"MSMARCO":{"query": "Given a web search query, retrieve relevant passages that answer the query","corpus": ""},"DBPedia":{"query": "Given a query, retrieve relevant entity descriptions from DBPedia","corpus": ""},"NQ":{"query": "Given a question, retrieve passages that answer the question","corpus": ""},"QuoraRetrieval":{"query": "Given a question, retrieve questions that are semantically equivalent to the given question","corpus": "Given a question, retrieve questions that are semantically equivalent to the given question"},"SCIDOCS":{"query": "Given a scientific paper title, retrieve paper abstracts that are cited by the given paper","corpus": ""},"TRECCOVID":{"query": "Given a query on COVID-19, retrieve documents that answer the query","corpus": ""},"Touche2020":{"query": "Given a question, retrieve passages that answer the question","corpus": ""},"SciFact":{"query": "Given a scientific claim, retrieve documents that support or refute the claim","corpus": ""},"NFCorpus":{"query": "Given a question, retrieve relevant documents that answer the question","corpus": ""},"ArguAna":{"query": "Given a claim, retrieve documents that support or refute the claim","corpus": ""},"FiQA2018":{"query": "Given a financial question, retrieve relevant passages that answer the query","corpus": ""},"STS":{"text": "Retrieve semantically similar text"},"SUMM":{"text": "Given a news summary, retrieve other semantically similar summaries"}
}

如何启用多 GPU(注意,这是 HuggingFace Transformers的情况)

from transformers import AutoModel
from torch.nn import DataParallelembedding_model = AutoModel.from_pretrained("nvidia/NV-Embed-v2")
for module_key, module in embedding_model._modules.items():embedding_model._modules[module_key] = DataParallel(module)

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.xdnf.cn/news/16711.html

如若内容造成侵权/违法违规/事实不符,请联系一条长河网进行投诉反馈,一经查实,立即删除!

相关文章

Flume1.9.0自定义拦截器

需求 1、在linux日志文件/data/log/moreInfoRes.log中一直会产生如下JSON数据: {"id":"14943445328940974601","uid":"840717325115457536","lat":"53.530598","lnt":"-2.5620373&qu…

RadSystems 自定义页面全攻略:个性化任务管理系统的实战设计

系列文章目录 探索RadSystems:低代码开发的新选择(一)🚪 探索RadSystems:低代码开发的新选择(二)🚪 探索RadSystems:低代码开发的新选择(三)&…

【开发基础】语义化版本控制

语义化版本控制 基础三级结构主版本号次版本号修正版本号 思维导图在node包管理中的特殊规则 参考文件 基础 语义化版本控制是一套通用的包/库的版本管理规范。在各类语言的包管理中都有用到,一般以x.x.x的形式出现在包的命名中。 三级结构 在语义化版本控制中&a…

IDC 报告:百度智能云 VectorDB 优势数量 TOP 1

近日,IDC 发布了《RAG 与向量数据库市场前景预测》报告,深入剖析了检索增强生成(RAG)技术和向量数据库市场的发展趋势。报告不仅绘制了 RAG 技术的发展蓝图,还评估了市场上的主要厂商。在这一评估中,百度智…

操作系统——同步

笔记内容及图片整理自XJTUSE “操作系统” 课程ppt,仅供学习交流使用,谢谢。 背景 解决有界缓冲区问题的共享内存方法在并发变量上存在竞争条件,即多个并发进程访问和操作同一个共享数据,从而其执行结果与特定访问次序有关。这种…

IAR调试时输出文本数据到电脑(未使用串口)

说明 因为板子没引出串口引脚,没法接USB转TTL,又想要长时间运行程序并保存某些特定数据,所以找到了这个办法。为了写数据到本机,所以板子必须运行在IAR的debug模式下。 参考:IAR环境下变量导出方法:https…

基于Java Springboot美食食谱推荐系统

一、作品包含 源码数据库设计文档万字PPT全套环境和工具资源部署教程 二、项目技术 前端技术:Html、Css、Js、Vue、Element-ui 数据库:MySQL 后端技术:Java、Spring Boot、MyBatis 三、运行环境 开发工具:IDEA 数据库&…

Java集合(Collection+Map)

Java集合&#xff08;CollectionMap&#xff09; 为什么要使用集合&#xff1f;泛型 <>集合框架单列集合CollectionCollection遍历方式List&#xff1a;有序、可重复、有索引ArrayListLinkedListVector&#xff08;已经淘汰&#xff0c;不会再用&#xff09; Set&#xf…

vue 项目使用 nginx 部署

前言 记录下使用element-admin-template 改造项目踩过的坑及打包部署过程 一、根据权限增加动态路由不生效 原因是Sidebar中路由取的 this.$router.options.routes,需要在计算路由 permission.js 增加如下代码 // generate accessible routes map based on roles const acce…

vue3 中直接使用 JSX ( lang=“tsx“ 的用法)

1. 安装依赖 npm i vitejs/plugin-vue-jsx2. 添加配置 vite.config.ts 中 import vueJsx from vitejs/plugin-vue-jsxplugins 中添加 vueJsx()3. 页面使用 <!-- 注意 lang 的值为 tsx --> <script setup lang"tsx"> const isDark ref(false)// 此处…

uniapp如何i18n国际化

1、正常情况下项目在代码生成的时候就已经有i18n的相关依赖&#xff0c;如果没有可以自行使用如下命令下载&#xff1a; npm install vue-i18n --save 2、创建相关文件 en文件下&#xff1a; zh文件下&#xff1a; index文件下&#xff1a; 3、在main.js中注册&#xff1a…

【视觉SLAM】4b-特征点法估计相机运动之PnP 3D-2D

文章目录 1 问题引入2 求解P3P 1 问题引入 透视n点&#xff08;Perspective-n-Point&#xff0c;PnP&#xff09;问题是计算机视觉领域的经典问题&#xff0c;用于求解3D-2D的点运动。换句话说&#xff0c;当知道n个3D空间点坐标以及它们在图像上的投影点坐标时&#xff0c;可…

wsl2配置文件.wslconfig不生效

问题 今天在使用wsl时&#xff0c;通过以下配置关闭swap内存&#xff0c;但是发现重启虚拟机之后也不会生效。 [wsl2] swap0 memory16GB后来在微软官方文档里看到&#xff0c;只有wsl2才支持通过.wslconfig文件配置&#xff0c;于是通过wsl -l -v查看当前wsl版本&#xff0c;…

借助Excel实现Word表格快速排序

实例需求&#xff1a;Word中的表格如下图所示&#xff0c;为了强化记忆&#xff0c;希望能够将表格内容随机排序&#xff0c;表格第一列仍然按照顺序编号&#xff0c;即编号不跟随表格行内容调整。 乱序之后的效果如下图所示&#xff08;每次运行代码的结果都不一定相同&#x…

系统架构设计师:系统架构设计基础知识

从第一个程序被划分成模块开始&#xff0c;软件系统就有了架构。 现在&#xff0c;有效的软件架构及其明确的描述和设计&#xff0c;已经成为软件工程领域中重要的主题。 由于不同人对Software Architecture (简称SA) 的翻译不尽相同&#xff0c;企业界喜欢叫”软件架构“&am…

T6识别好莱坞明星

&#x1f368; 本文为&#x1f517;365天深度学习训练营 中的学习记录博客&#x1f356; 原作者&#xff1a;K同学啊 导入基础的包 from tensorflow import keras from tensorflow.keras import layers,models import os, PIL, pathlib import matplotlib.pyplot as pl…

MybatisPlus的基础使用

提示&#xff1a;文章写完后&#xff0c;目录可以自动生成&#xff0c;如何生成可参考右边的帮助文档 文章目录 前言1、基础crud增加insert()方法&#xff1a; 删除修改查询 2、分页查询配置分页拦截器使用分页查询功能开启MP日志在yml配置文件中配置日志查看日志 3、条件查询条…

基于stm32的智能变频电冰箱系统

基于stm32的智能变频电冰箱系统 持续更新&#xff0c;欢迎关注!!! 基于stm32的智能变频电冰箱系统 随着集成电路技术的发展&#xff0c;单片微型计算机的功能也不断增强&#xff0c;许多高性能的新型机种不断涌现出来。单片机以其功能强、体积小、可靠性高、造价低和开发周期短…

【提高篇】3.3 GPIO(三,工作模式详解 上)

目录 一,工作模式介绍 二,输入浮空 三,输入上拉 一,工作模式介绍 GPIO有八种工作模式,参考下面列表,我们先有一个简单的认识。 二,输入浮空 在输入浮空模式下,上拉/下拉电阻为断开状态,施密特触发器打开,输出被禁止。输入浮空模式下,IO口的电平完全是由外部电路…

代码训练营 day66|Floyd 算法、A * 算法、最短路算法总结

前言 这里记录一下陈菜菜的刷题记录&#xff0c;主要应对25秋招、春招 个人背景 211CS本CUHK计算机相关硕&#xff0c;一年车企软件开发经验 代码能力&#xff1a;有待提高 常用语言&#xff1a;C 系列文章目录 第66天 &#xff1a;第十一章&#xff1a;图论part11 文章目录…