Generate en docs
Milvus-doc-bot authored and Milvus-doc-bot committed Jan 13, 2025
1 parent 09869fe commit d11f5b2
Showing 15 changed files with 371 additions and 40 deletions.
2 changes: 1 addition & 1 deletion localization/v2.4.x/site/en/adminGuide/monitor/alert.md
Expand Up @@ -71,7 +71,7 @@ summary: Learn how to create an alert for Milvus services in Grafana.
d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"
></path>
</svg>
</button></h2><p>This guide takes the example of creating an alert for the memory usage of Milvus components. To create other types of alerts, please adjust your commands accordingly. If you encounter any problems during the process, feel free to ask in the <a href="https://discuss.milvus.io/">Milvus forum</a> or initiate a discussion on <a href="https://join.slack.com/t/milvusio/shared_invite/zt-e0u4qu3k-bI2GDNys3ZqX1YCJ9OM~GQ">Slack</a>.</p>
</button></h2><p>This guide walks through creating an alert for the memory usage of Milvus components. To create other types of alerts, adjust your commands accordingly. If you encounter any problems during the process, feel free to ask in <a href="https://github.com/milvus-io/milvus/discussions">GitHub Discussions</a> or start a thread on <a href="https://discord.com/invite/8uyFbECzPX">Discord</a>.</p>
<h3 id="Prerequisites" class="common-anchor-header">Prerequisites</h3><p>This tutorial assumes that you have Grafana installed and configured. If not, we recommend reading the <a href="/docs/monitor.md">monitoring guide</a>.</p>
<h3 id="1-Add-a-new-query" class="common-anchor-header">1. Add a new query</h3><p>To add an alert for the memory usage of Milvus components, edit the Memory panel. Then, add a new query with the metric: <code translate="no">process_resident_memory_bytes{app_kubernetes_io_name=&quot;milvus&quot;, app_kubernetes_io_instance=~&quot;my-release&quot;, namespace=&quot;default&quot;}</code></p>
<p>
2 changes: 1 addition & 1 deletion localization/v2.4.x/site/en/faq/operational_faq.md
Expand Up @@ -102,5 +102,5 @@ This prioritization can result in fewer unique results than the limit if your co
<h4 id="Still-have-questions" class="common-anchor-header">Still have questions?</h4><p>You can:</p>
<ul>
<li>Check out <a href="https://github.com/milvus-io/milvus/issues">Milvus</a> on GitHub. Feel free to ask questions, share ideas, and help others.</li>
<li>Join our <a href="https://discuss.milvus.io/">Milvus Forum</a> or <a href="https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk">Slack Channel</a> to find support and engage with our open-source community.</li>
<li>Join our <a href="https://discord.com/invite/8uyFbECzPX">Discord Server</a> to find support and engage with our open-source community.</li>
</ul>
2 changes: 1 addition & 1 deletion localization/v2.4.x/site/en/faq/performance_faq.md
Expand Up @@ -62,5 +62,5 @@ title: Performance FAQ
<h4 id="Still-have-questions" class="common-anchor-header">Still have questions?</h4><p>You can:</p>
<ul>
<li>Check out <a href="https://github.com/milvus-io/milvus/issues">Milvus</a> on GitHub. Feel free to ask questions, share ideas, and help others.</li>
<li>Join our <a href="https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk">Slack Channel</a> to find support and engage with our open-source community.</li>
<li>Join our <a href="https://discord.com/invite/8uyFbECzPX">Discord Server</a> to find support and engage with our open-source community.</li>
</ul>
2 changes: 1 addition & 1 deletion localization/v2.4.x/site/en/faq/product_faq.md
Expand Up @@ -78,5 +78,5 @@ title: Product FAQ
<h4 id="Still-have-questions" class="common-anchor-header">Still have questions?</h4><p>You can:</p>
<ul>
<li>Check out <a href="https://github.com/milvus-io/milvus/issues">Milvus</a> on GitHub. You’re welcome to raise questions, share ideas, and help others.</li>
<li>Join our <a href="https://slack.milvus.io/">Slack community</a> to find support and engage with our open-source community.</li>
<li>Join our <a href="https://discord.com/invite/8uyFbECzPX">Discord Server</a> to find support and engage with our open-source community.</li>
</ul>
2 changes: 1 addition & 1 deletion localization/v2.4.x/site/en/faq/troubleshooting.md
Expand Up @@ -109,6 +109,6 @@ kubectl kubectl scale sts &lt;etcd-sts&gt; --replicas=<span class="hljs-number">
<p><br/></p>
<p>If you need help solving a problem, feel free to:</p>
<ul>
<li>Join our <a href="https://join.slack.com/t/milvusio/shared_invite/enQtNzY1OTQ0NDI3NjMzLWNmYmM1NmNjOTQ5MGI5NDhhYmRhMGU5M2NhNzhhMDMzY2MzNDdlYjM5ODQ5MmE3ODFlYzU3YjJkNmVlNDQ2ZTk">Slack channel</a> and reach out for support from the Milvus team.</li>
<li>Join our <a href="https://discord.com/invite/8uyFbECzPX">Discord Server</a> and reach out for support from the Milvus team.</li>
<li><a href="https://github.com/milvus-io/milvus/issues/new/choose">File an Issue</a> on GitHub that includes details about your problem.</li>
</ul>
@@ -1 +1 @@
{"codeList":[],"headingContent":"Data Processing","anchorList":[{"label":"Data Processing","href":"Data-Processing","type":1,"isActive":false},{"label":"Data insertion","href":"Data-insertion","type":2,"isActive":false},{"label":"Index building","href":"Index-building","type":2,"isActive":false},{"label":"Data query","href":"Data-query","type":2,"isActive":false},{"label":"What's next","href":"Whats-next","type":2,"isActive":false}]}
{"codeList":[],"headingContent":"Data processing","anchorList":[{"label":"Data processing","href":"Data-processing","type":1,"isActive":false},{"label":"Data insertion","href":"Data-insertion","type":2,"isActive":false},{"label":"Index building","href":"Index-building","type":2,"isActive":false},{"label":"Data query","href":"Data-query","type":2,"isActive":false},{"label":"What's next","href":"Whats-next","type":2,"isActive":false}]}
Expand Up @@ -3,7 +3,7 @@ id: data_processing.md
summary: Learn about the data processing procedure in Milvus.
title: Data Processing
---
<h1 id="Data-Processing" class="common-anchor-header">Data Processing<button data-href="#Data-Processing" class="anchor-icon" translate="no">
<h1 id="Data-processing" class="common-anchor-header">Data processing<button data-href="#Data-processing" class="anchor-icon" translate="no">
<svg translate="no"
aria-hidden="true"
focusable="false"
Expand All @@ -18,7 +18,7 @@ title: Data Processing
d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"
></path>
</svg>
</button></h1><p>This article provides a detailed description of the implementation of data insertion, index building, and data query in Milvus.</p>
</button></h1><p>This article provides a detailed description of the implementation of data insertion, index building, and data querying in Milvus.</p>
<h2 id="Data-insertion" class="common-anchor-header">Data insertion<button data-href="#Data-insertion" class="anchor-icon" translate="no">
<svg translate="no"
aria-hidden="true"
Expand All @@ -35,7 +35,7 @@ title: Data Processing
></path>
</svg>
</button></h2><p>You can specify a number of shards for each collection in Milvus, each shard corresponding to a virtual channel (<em>vchannel</em>). As the following figure shows, Milvus assigns each vchannel in the log broker a physical channel (<em>pchannel</em>). Any incoming insert/delete request is routed to shards based on the hash value of primary key.</p>
<p>Validation of DML requests is moved forward to proxy because Milvus does not have complicated transactions. Proxy would request a timestamp for each insert/delete request from TSO (Timestamp Oracle), which is the timing module that colocates with the root coordinator. With the older timestamp being overwritten by the newer one, timestamps are used to determine the sequence of data requests being processed. Proxy retrieves information in batches from data coord including entities’ segments and primary keys to increase overall throughput and avoid overburdening the central node.</p>
<p>Validation of DML requests is moved forward to the proxy because Milvus does not have complicated transactions. The proxy requests a timestamp for each insert/delete request from the TSO (Timestamp Oracle), which is the timing module that colocates with the root coordinator. With the older timestamp being overwritten by the newer one, timestamps are used to determine the sequence of data requests being processed. The proxy retrieves information in batches from data coord including entities’ segments and primary keys to increase overall throughput and avoid overburdening the central node.</p>
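<p>The hash-based routing described above can be sketched as follows. This is a minimal illustration, not Milvus's implementation: the names <code translate="no">shard_for</code> and <code translate="no">route</code> are hypothetical, and <code translate="no">zlib.crc32</code> is only a stand-in for whatever hash function Milvus actually applies to primary keys.</p>

```python
import zlib


def shard_for(primary_key, num_shards):
    """Map a primary key to a shard (vchannel) index with a stable hash.

    zlib.crc32 is an illustrative assumption; it is used here because,
    unlike the built-in hash() for strings, it is stable across runs.
    """
    digest = zlib.crc32(str(primary_key).encode("utf-8"))
    return digest % num_shards


def route(requests, num_shards):
    """Group (primary_key, payload) requests by their target shard."""
    shards = {i: [] for i in range(num_shards)}
    for primary_key, payload in requests:
        shards[shard_for(primary_key, num_shards)].append((primary_key, payload))
    return shards


if __name__ == "__main__":
    batch = [(1, "insert"), (2, "insert"), (1, "delete")]
    for shard, items in route(batch, num_shards=2).items():
        print(shard, items)
```

<p>Because the shard depends only on the primary key, an insert and a later delete for the same entity always land in the same shard, so their relative order can be decided within that shard by timestamp.</p>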
<p>
<span class="img-wrapper">
<img translate="no" src="/docs/v2.4.x/assets/channels_1.jpg" alt="Channels 1" class="doc-image" id="channels-1" />
Expand All @@ -49,14 +49,14 @@ title: Data Processing
<span>Channels 2</span>
</span>
</p>
<p><em>Vchannels</em> are maintained in the underlying log broker nodes. Each channel is physically indivisible and available for any but only one node. When data ingestion rate reaches bottleneck, consider two things: Whether the log broker node is overloaded and needs to be scaled, and whether there are sufficient shards to ensure load balance for each node.</p>
<p><em>Vchannels</em> are maintained in the underlying log broker nodes. Each channel is physically indivisible and can be served by any node, but by only one. When the data ingestion rate reaches a bottleneck, consider two things: whether the log broker node is overloaded and needs to be scaled, and whether there are sufficient shards to ensure load balancing for each node.</p>
<p>
<span class="img-wrapper">
<img translate="no" src="/docs/v2.4.x/assets/write_log_sequence.jpg" alt="Write log sequence" class="doc-image" id="write-log-sequence" />
<span>Write log sequence</span>
</span>
</p>
<p>The above diagram encapsulates four components involved in the process of writing log sequence: proxy, log broker, data node, and object storage. The process involves four tasks: validation of DML requests, publication-subscription of log sequence, conversion from streaming log to log snapshots, and persistence of log snapshots. The four tasks are decoupled from each other to make sure each task is handled by its corresponding node type. Nodes of the same type are made equal and can be scaled elastically and independently to accommodate various data loads, massive and highly fluctuating streaming data in particular.</p>
<p>The above diagram shows the four components involved in writing the log sequence: the proxy, log broker, data node, and object storage. The process involves four tasks: validation of DML requests, publication-subscription of the log sequence, conversion from a streaming log to log snapshots, and persistence of log snapshots. The four tasks are decoupled from each other to make sure each task is handled by its corresponding node type. Nodes of the same type are made equal and can be scaled elastically and independently to accommodate various data loads, massive and highly fluctuating streaming data in particular.</p>
<h2 id="Index-building" class="common-anchor-header">Index building<button data-href="#Index-building" class="anchor-icon" translate="no">
<svg translate="no"
aria-hidden="true"
Expand All @@ -72,17 +72,17 @@ title: Data Processing
d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"
></path>
</svg>
</button></h2><p>Index building is performed by index node. To avoid frequent index building for data updates, a collection in Milvus is divided further into segments, each with its own index.</p>
</button></h2><p>Index building is performed by index nodes. To avoid frequent index building for data updates, a collection in Milvus is divided further into segments, each with its own index.</p>
<p>
<span class="img-wrapper">
<img translate="no" src="/docs/v2.4.x/assets/index_building.jpg" alt="Index building" class="doc-image" id="index-building" />
<span>Index building</span>
</span>
</p>
<p>Milvus supports building index for each vector field, scalar field and primary field. Both the input and output of index building engage with object storage: The index node loads the log snapshots to index from a segment (which is in object storage) to memory, deserializes the corresponding data and metadata to build index, serializes the index when index building completes, and writes it back to object storage.</p>
<p>Index building mainly involves vector and matrix operations and hence is computation- and memory-intensive. Vectors cannot be efficiently indexed with traditional tree-based indexes due to their high-dimensional nature, but can be indexed with techniques that are more mature in this subject, such as cluster- or graph-based indexes. Regardless its type, building index involves massive iterative calculations for large-scale vectors, such as Kmeans or graph traverse.</p>
<p>Unlike indexing for scalar data, building vector index has to take full advantage of SIMD (single instruction, multiple data) acceleration. Milvus has innate support for SIMD instruction sets, e.g., SSE, AVX2, and AVX512. Given the “hiccup” and resource-intensive nature of vector index building, elasticity becomes crucially important to Milvus in economical terms. Future Milvus releases will further explorations in heterogeneous computing and serverless computation to bring down the related costs.</p>
<p>Besides, Milvus also supports scalar filtering and primary field query. It has inbuilt indexes to improve query efficiency, e.g., Bloom filter indexes, hash indexes, tree-based indexes, and inverted indexes, and plans to introduce more external indexes, e.g., bitmap indexes and rough indexes.</p>
<p>Milvus supports building indices for each vector field, scalar field, and primary field. Both the input and output of index building engage with object storage: the index node loads the log snapshots to index from a segment (which is in object storage) into memory, deserializes the corresponding data and metadata to build the index, serializes the index when index building completes, and writes it back to object storage.</p>
<p>Index building mainly involves vector and matrix operations and hence is compute- and memory-intensive. Vectors cannot be efficiently indexed with traditional tree-based indices due to their high-dimensional nature, but can be indexed with techniques especially designed for this task, such as cluster- or graph-based indices. Regardless of its type, building an index involves massive iterative calculations for large-scale vectors, such as K-means or graph traversal.</p>
<p>Unlike indexing for scalar data, building a vector index benefits greatly from SIMD (single instruction, multiple data) acceleration. Milvus has innate support for SIMD instruction sets, e.g., SSE, AVX2, and AVX512. Given the “hiccup” and resource-intensive nature of vector index building, elasticity becomes crucially important to Milvus in economic terms. Future Milvus releases will further explore heterogeneous computing and serverless computation to bring down the related costs.</p>
<p>Milvus also supports scalar filtering and primary field query. It has inbuilt indices to improve query efficiency, e.g., Bloom filter indices, hash indices, tree-based indices, and inverted indices, and plans to introduce more external indices, e.g., bitmap indices and rough indices.</p>
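<p>To make the "massive iterative calculations" concrete, here is a minimal pure-Python sketch of the K-means (Lloyd's) iteration that cluster-based indices rely on. It is an illustration under stated assumptions, not Milvus code: the function names are hypothetical, initial centroids are supplied by the caller, and a real index build would run vectorized, SIMD-accelerated kernels over far larger data.</p>

```python
import math


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of Lloyd iterations and return the final centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids


if __name__ == "__main__":
    vectors = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(vectors, centroids=[(0, 0), (10, 10)]))
```

<p>Every iteration touches every vector and every centroid, which is why the cost grows quickly with collection size and why building one index per segment keeps each build bounded.</p>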
<h2 id="Data-query" class="common-anchor-header">Data query<button data-href="#Data-query" class="anchor-icon" translate="no">
<svg translate="no"
aria-hidden="true"
Expand All @@ -98,15 +98,15 @@ title: Data Processing
d="M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z"
></path>
</svg>
</button></h2><p>Data query refers to the process of searching a specified collection for <em>k</em> number of vectors nearest to a target vector or for <em>all</em> vectors within a specified distance range to the vector. Vectors are returned together with their corresponding primary key and fields.</p>
</button></h2><p>The term “data query” refers to the process of searching a specified collection for the <em>k</em> vectors nearest to a target vector or for <em>all</em> vectors within a specified distance range of the target vector. Vectors are returned together with their corresponding primary key and fields.</p>
<p>
<span class="img-wrapper">
<img translate="no" src="/docs/v2.4.x/assets/data_query.jpg" alt="Data query" class="doc-image" id="data-query" />
<span>Data query</span>
</span>
</p>
<p>A collection in Milvus is split into multiple segments, and the query nodes loads indexes by segment. When a search request arrives, it is broadcast to all query nodes for a concurrent search. Each node then prunes the local segments, searches for vectors meeting the criteria, and reduces and returns the search results.</p>
<p>Query nodes are independent from each other in a data query. Each node is responsible only for two tasks: Load or release segments following the instructions from query coord; conduct a search within the local segments. And proxy is responsible for reducing search results from each query node and returning the final results to the client.</p>
<p>A collection in Milvus is split into multiple segments, and the query nodes load indices by segment. When a search request arrives, it is broadcast to all query nodes for a concurrent search. Each node then prunes the local segments, searches for vectors meeting the criteria, and reduces and returns the search results.</p>
<p>Query nodes are independent of each other in a data query. Each node is responsible for only two tasks: loading or releasing segments following the instructions from query coord, and conducting a search within the local segments. The proxy is responsible for reducing search results from each query node and returning the final results to the client.</p>
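<p>The two-level reduce described above can be sketched as follows. This is a simplified illustration rather than Milvus's implementation: the function names are hypothetical, hits are modeled as <code translate="no">(distance, primary_key)</code> tuples, and a smaller distance is assumed to mean a closer match (as with L2 distance).</p>

```python
import heapq
import itertools


def local_top_k(segment_hits, k):
    """Query-node side: reduce hits from all local segments to the k nearest.

    segment_hits is a list of per-segment hit lists; the result is sorted
    by ascending distance.
    """
    all_hits = [hit for hits in segment_hits for hit in hits]
    return heapq.nsmallest(k, all_hits)


def proxy_reduce(node_results, k):
    """Proxy side: merge the sorted per-node results and keep the global top k."""
    merged = heapq.merge(*node_results)
    return list(itertools.islice(merged, k))


if __name__ == "__main__":
    node_a = local_top_k([[(0.9, 1), (0.1, 2)], [(0.5, 3)]], k=2)
    node_b = local_top_k([[(0.3, 4), (0.7, 5)]], k=2)
    print(proxy_reduce([node_a, node_b], k=2))
```

<p>Each node only needs to return its own <em>k</em> best hits, because no hit outside a node's local top <em>k</em> can appear in the global top <em>k</em>. This keeps the data sent to the proxy small regardless of segment size.</p>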
<p>
<span class="img-wrapper">
<img translate="no" src="/docs/v2.4.x/assets/handoff.jpg" alt="Handoff" class="doc-image" id="handoff" />