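Cloud-native neural search framework for any kind of data
Jina allows you to build deep-learning-powered search-as-a-service in just minutes.
🌌 All data types - Large-scale indexing and querying of any kind of unstructured data: video, image, long/short text, music, source code, PDF, etc.
🌩️ Fast & cloud-native - Distributed architecture from day one, scalable and cloud-native by design: enjoy containerizing, streaming, paralleling, sharding, async scheduling, and HTTP/gRPC/WebSocket protocols.
⏱️ Save time - The design pattern of neural search systems: from zero to a production-ready system in minutes.
🍱 Own your stack - Keep end-to-end stack ownership of your solution, and avoid the integration pitfalls of fragmented, multi-vendor, generic legacy tools.
Run Quick Demo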
- 👗 Fashion image search:
jina hello fashion
- 🤖 QA chatbot:
pip install "jina[chatbot]" && jina hello chatbot
- 📰 Multimodal search:
pip install "jina[multimodal]" && jina hello multimodal
- 🍴 Fork the source code of a demo to your folder:
jina hello fork fashion ../my-proj/
Install
- via PyPI:
pip install -U "jina[standard]"
- via Docker:
docker run jinaai/jina:latest
More installation options
x86/64, arm64, v6, v7 | Linux/macOS with Python 3.7/3.8/3.9 | Docker users |
---|---|---|
Minimum (no HTTP, WebSocket, or Docker support) | pip install jina | docker run jinaai/jina:latest |
Daemon | pip install "jina[daemon]" | docker run --network=host jinaai/jina:latest-daemon |
With extras | pip install "jina[devel]" | docker run jinaai/jina:latest-devel |
Version identifiers are explained here. Jina can run on Windows Subsystem for Linux. We welcome the community to help us with native Windows support.
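Get Started
Document, Executor, and Flow are the three fundamental concepts in Jina.
- 📄 Document is the basic data type in Jina;
- ⚙️ Executor is how Jina processes Documents;
- 🔀 Flow is how Jina streamlines and distributes Executors.
1️⃣ Copy-paste the minimal example below and run it:
💡 Preliminaries: character embedding, pooling, Euclidean distance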
import numpy as np
from jina import Document, DocumentArray, Executor, Flow, requests


class CharEmbed(Executor):  # a simple character embedding with mean-pooling
    offset = 32  # ASCII code of the space character, the first printable char
    dim = 127 - offset + 1  # last pos reserved for `UNK`
    char_embd = np.eye(dim) * 1  # one-hot embedding for all chars

    @requests
    def foo(self, docs: DocumentArray, **kwargs):
        for d in docs:
            r_emb = [ord(c) - self.offset if self.offset <= ord(c) <= 127 else (self.dim - 1) for c in d.text]
            d.embedding = self.char_embd[r_emb, :].mean(axis=0)  # average pooling


class Indexer(Executor):
    _docs = DocumentArray()  # for storing all documents in memory

    @requests(on='/index')
    def foo(self, docs: DocumentArray, **kwargs):
        self._docs.extend(docs)  # extend stored `docs`

    @requests(on='/search')
    def bar(self, docs: DocumentArray, **kwargs):
        q = np.stack(docs.get_attributes('embedding'))  # get all embeddings from query docs
        d = np.stack(self._docs.get_attributes('embedding'))  # get all embeddings from stored docs
        euclidean_dist = np.linalg.norm(q[:, None, :] - d[None, :, :], axis=-1)  # pairwise euclidean distance
        for dist, query in zip(euclidean_dist, docs):  # add & sort match
            query.matches = [Document(self._docs[int(idx)], copy=True, scores={'euclid': d}) for idx, d in enumerate(dist)]
            query.matches.sort(key=lambda m: m.scores['euclid'].value)  # sort matches by their values


f = Flow(port_expose=12345, protocol='http', cors=True).add(uses=CharEmbed, parallel=2).add(uses=Indexer)  # build a Flow with 2 parallel CharEmbed, though unnecessary
with f:
    f.post('/index', (Document(text=t.strip()) for t in open(__file__) if t.strip()))  # index all lines of _this_ file
    f.block()  # block for listening request
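To make the preliminaries concrete, here is a minimal NumPy-only sketch of the two computations the example relies on: one-hot character embedding with mean pooling, and pairwise Euclidean distance. It mirrors the logic of `CharEmbed` and `Indexer` but does not use Jina at all; the helper names `char_embed` and `pairwise_euclidean` and the three sample lines are assumptions made for illustration.

```python
import numpy as np

OFFSET = 32             # ASCII code of the first printable character (space)
DIM = 127 - OFFSET + 1  # 96 slots; the last one also serves as the `UNK` slot
ONE_HOT = np.eye(DIM)   # row i is the one-hot vector of character slot i


def char_embed(text: str) -> np.ndarray:
    """Mean-pool the one-hot vectors of all characters in `text` (hypothetical helper)."""
    slots = [ord(c) - OFFSET if OFFSET <= ord(c) <= 127 else DIM - 1 for c in text]
    return ONE_HOT[slots, :].mean(axis=0)


def pairwise_euclidean(queries: np.ndarray, stored: np.ndarray) -> np.ndarray:
    """Return a (num_queries, num_stored) matrix of Euclidean distances."""
    return np.linalg.norm(queries[:, None, :] - stored[None, :, :], axis=-1)


# Hypothetical sample lines standing in for the indexed file.
lines = ["@requests(on='/index')", "import numpy as np", "with f:"]
stored = np.stack([char_embed(t) for t in lines])
query = np.stack([char_embed("@requests(on=something)")])

dist = pairwise_euclidean(query, stored)  # shape (1, 3)
ranking = np.argsort(dist[0])             # indices of stored lines, nearest first
for rank, idx in enumerate(ranking):
    print(f"[{rank}] {dist[0, idx]:.6f}: {lines[idx]!r}")
```

Because the embedding is just a normalized character histogram, two lines rank as similar when they share many of the same characters, regardless of order. That is enough for a toy demo but is not a semantic embedding.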
2️⃣ Open http://localhost:12345/docs (an extended Swagger UI) in your browser, click the /search tab, and input:
{"data": [{"text": "@requests(on=something)"}]}
That is, we want to find the lines from the above code snippet that are most similar to @requests(on=something).
Now click the Execute button!
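The Swagger UI step boils down to an HTTP request with a JSON body, so it can be sketched with the Python standard library alone. The snippet below only builds the request and prints it. It assumes that the /search tab corresponds to a `POST` on the `/search` path and that the gateway accepts `application/json`; neither is spelled out above, and `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_search_request(text: str, host: str = "localhost", port: int = 12345) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one query Document."""
    payload = {"data": [{"text": text}]}  # same shape as the Swagger UI input
    return urllib.request.Request(
        url=f"http://{host}:{port}/search",  # assumed path, taken from the tab name
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed content type
        method="POST",
    )


req = build_search_request("@requests(on=something)")
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))

# With the server from step 1 running, the request could be sent like this:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```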
3️⃣ Not a GUI person? Then let's do it in Python! Keep the above server running and start a simple client:
from jina import Client, Document
from jina.types.request import Response


def print_matches(resp: Response):  # the callback function invoked when task is done
    for idx, d in enumerate(resp.docs[0].matches[:3]):  # print top-3 matches
        print(f'[{idx}]{d.scores["euclid"].value:2f}: "{d.text}"')


c = Client(protocol='http', port_expose=12345)  # connect to localhost:12345
c.post('/search', Document(text='request(on=something)'), on_done=print_matches)
which prints the following results:
Client@1608[S]:connected to the gateway at localhost:12345!
[0]0.168526: "@requests(on='/index')"
[1]0.181676: "@requests(on='/search')"
[2]0.192049: "query.matches = [Document(self._docs[int(idx)], copy=True, score=d) for idx, d in enumerate(dist)]"
😔 Doesn't work? Our bad! Please report it here.
Read Tutorials
- 🧠What is “Neural Search”?
- 📄 Document & DocumentArray: the basic data type in Jina
- ⚙️ Executor: how Jina processes Documents
- 🔀 Flow: how Jina streamlines and distributes Executors
- 🤹 Serving Jina
- 📓Developer Reference
- 🧼Clean & Efficient Coding in Jina
- 😎3 Reasons to Use Jina 2.0
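Support
- Join our Slack community to discuss your use cases, questions, and support queries with our engineers.
- Join our Engineering All Hands meet-up to discuss your use case and learn about Jina's new features.
  - When? The second Tuesday of every month
  - Where? Zoom (see our public events calendar/.ical) and live stream on YouTube
- Subscribe to the latest video tutorials on our YouTube channel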
Join Us
Jina is backed by Jina AI. We are actively hiring full-stack developers and solution engineers to build the next neural search ecosystem in open source.
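Contributing
We welcome all kinds of contributions from the open-source community, individuals, and partners. We owe our success to your active involvement.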