[{"category":"Document","content":"Batch document operation # Provides an efficient way to perform multiple index, create, delete, and update operations in a single request.\nExamples # POST /_bulk { \u0026#34;index\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;1\u0026#34; } } { \u0026#34;field1\u0026#34; : \u0026#34;value1\u0026#34; } { \u0026#34;delete\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;2\u0026#34; } } { \u0026#34;create\u0026#34; : { \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;, \u0026#34;_id\u0026#34; : \u0026#34;3\u0026#34; } } { \u0026#34;field1\u0026#34; : \u0026#34;value3\u0026#34; } { \u0026#34;update\u0026#34; : {\u0026#34;_id\u0026#34; : \u0026#34;1\u0026#34;, \u0026#34;_index\u0026#34; : \u0026#34;test\u0026#34;} } { \u0026#34;doc\u0026#34; : {\u0026#34;field2\u0026#34; : \u0026#34;value2\u0026#34;} } The API returns the following result:\n{ \u0026#34;took\u0026#34;: 30, \u0026#34;errors\u0026#34;: false, \u0026#34;items\u0026#34;: [ { \u0026#34;index\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } }, { \u0026#34;delete\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;not_found\u0026#34;, ... } }, { \u0026#34;create\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } }, { \u0026#34;update\u0026#34;: { \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;test\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;updated\u0026#34;, ... 
} } ] } Request # POST /_bulk POST /\u0026lt;target\u0026gt;/_bulk Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. Request body # The actions are specified in the request body using a newline delimited JSON (NDJSON) structure:\naction_and_meta_data\\n optional_source\\n action_and_meta_data\\n optional_source\\n .... action_and_meta_data\\n optional_source\\n The index and create actions expect a source on the next line, and have the same semantics as the standard API: create fails if a document with the same ID already exists in the target, index adds or replaces a document as necessary.\nupdate expects that the partial doc, upsert, and script and its options are specified on the next line.\ndelete does not expect a source on the next line and has the same semantics as the standard delete API.\nBecause this format uses literal \\n\u0026rsquo;s as delimiters, make sure that the JSON actions and sources are not pretty printed.\nIf you provide a \u0026lt;target\u0026gt; in the request path, it is used for any actions that don’t explicitly specify an _index argument.\ncreate # Indexes the specified document if it does not already exist. The following line must contain the source data to be indexed.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Optional, string) The document ID. If no ID is specified, a document ID is automatically generated. 
delete # Removes the specified document from the index.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Required, string) The document ID. index # Indexes the specified document. If the document exists, replaces the document and increments the version. The following line must contain the source data to be indexed.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. _id\n(Optional, string) The document ID. If no ID is specified, a document ID is automatically generated. delete # Removes the specified document from the index.\n _namespace\n(Optional, string) Name of the namespace to perform the action on. _collection\n(Optional, string) Name of the collection to perform the action on. This parameter is required if a \u0026lt;target\u0026gt; is not specified in the request path. _index\n(Optional, string) A shortcut to specify the namespace and collection in [\u0026lt;namespace\u0026gt;:]\u0026lt;collection\u0026gt; syntax. This parameter conflicts with \u0026lt;_namespace\u0026gt; and \u0026lt;_collection\u0026gt;. 
_id\n(Required, string) The document ID. doc # The partial document to index. Required for update operations.\n\u0026lt;fields\u0026gt; # The document source to index. Required for create and index operations.\n","subcategory":"Index","summary":"","tags":["bulk","index"],"title":"Batch document operation","url":"/docs/references/document/bulk/"},{"category":"Document","content":"Delete a document # Delete a specific document from the specified collection by specifying its unique identifier.\nExamples # Delete the document 0,0 from collection my-collection:\nDELETE /my-collection/_doc/0,0 The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;deleted\u0026#34;, ... } Request # DELETE /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) Unique identifier for the document; supports both _key and _id. 
","subcategory":"Index","summary":"","tags":["delete","index"],"title":"Delete a document","url":"/docs/references/document/delete/"},{"category":"Document","content":"Partial update a document # Sometimes only some of the fields of a document need to be updated.\nExamples # Update the org.id field of the document news_001 in the collection my-collection:\nPUT /my-collection/_doc/news_001/_update { \u0026#34;sync\u0026#34;:{ \u0026#34;replace\u0026#34;:{ \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } } } The API returns the following result:\n{\u0026#34;_id\u0026#34;:\u0026#34;0,0\u0026#34;, \u0026#34;_key\u0026#34;:\u0026#34;news_001\u0026#34;, \u0026#34;result\u0026#34;:\u0026#34;updated\u0026#34;} Pizza applies a partial update by fetching the document, merging the partial updates into it, and then replacing it.\nRequest # POST /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt;/_update { \u0026#34;sync\u0026#34;:{ \u0026lt;operation\u0026gt;: {\u0026lt;fields\u0026gt;} }, \u0026#34;async\u0026#34;:{ \u0026lt;operation\u0026gt;: {\u0026lt;fields\u0026gt;} } } Pizza supports both sync and async ways to perform updates. To update in real time, use sync.\nIn asynchronous mode, the update process is considered complete once the request is committed to the WAL. Background tasks independently consume and process updates asynchronously, making it suitable for scenarios prioritizing update efficiency.\nPath parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; supports both _key and _id. Request body # \u0026lt;operation\u0026gt; The operations supported by partial updates: add, replace, remove, array_append. \u0026lt;fields\u0026gt;\n(Required, string) The fields to apply the operation to, in JSON format. 
","subcategory":"Index","summary":"","tags":["update","partial","index"],"title":"Partial update a document","url":"/docs/references/document/partial_update/"},{"category":"Document","content":"Replace a document # Replace an existing document by specifying its unique identifier and the new content.\nExamples # Replace the document news_001 in the collection my-collection with new content:\nPUT /my-collection/_doc/news_001 { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } The API returns the following result:\n{\u0026#34;_id\u0026#34;:\u0026#34;0,0\u0026#34;, \u0026#34;_key\u0026#34;:\u0026#34;news_001\u0026#34;, \u0026#34;result\u0026#34;:\u0026#34;updated\u0026#34;} After the modification, if you perform the fetch request:\nGET /my-collection/_doc/news_001 It returns an updated document like:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 2, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;_key\u0026#34; : \u0026#34;news_001\u0026#34;, \u0026#34;found\u0026#34;: true, \u0026#34;_source\u0026#34; : { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infinilabs\u0026#34; } } } Note that the document _version was increased to 2.\nUnder the hood, Pizza marks the old document as deleted and inserts a new document.\nRequest # POST /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; {\u0026lt;fields\u0026gt;} Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; supports both _key and _id. 
Request body # \u0026lt;fields\u0026gt;\n(Required, string) The JSON source for the document data. ","subcategory":"Index","summary":"","tags":["replace","index"],"title":"Replace a document","url":"/docs/references/document/replace/"},{"category":"Document","content":"Fetch a document # Retrieve an existing document by specifying its unique identifier.\nExamples # Fetch a document from the my-collection collection with the customized uuid news_001:\nGET /my-collection/_doc/news_001 The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_collection\u0026#34;: \u0026#34;default:my-collection\u0026#34;, \u0026#34;_key\u0026#34; : \u0026#34;news_001\u0026#34;, \u0026#34;found\u0026#34;: true, \u0026#34;_source\u0026#34; : { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } } As you can see, the customized uuid is represented as _key within the document. An _id is also returned with the value 0,0; this is the internal ID generated by Pizza and is guaranteed to be unique, so you can also fetch the document by this value:\nGET /my-collection/_doc/0,0 Request # GET /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Required, string) The unique identifier of this document; supports both _key and _id. You can also use the HEAD method to check whether the specified document exists.\nHEAD /\u0026lt;target\u0026gt;/_doc/\u0026lt;doc_id\u0026gt; ","subcategory":"Index","summary":"","tags":["fetch","index"],"title":"Fetch a document","url":"/docs/references/document/fetch/"},{"category":null,"content":"How realtime works? # Do you like to keep your customers waiting? 
# [Pizza in Realtime] ","subcategory":null,"summary":"","tags":null,"title":"How realtime works","url":"/docs/overview/realtime/"},{"category":"Overview","content":"Why named Pizza? # Do you wonder why this project is named Pizza?\nTo infinity scaling # Pizza solves the challenge of managing massive data seamlessly. Imagine creating a collection and continuously adding documents, from zero to petabytes, without the need to worry about sharding or reindexing. Scaling your machine becomes effortless, ensuring a smooth, seamless, and painless experience for application developers.\nSharding puzzle # One of the world\u0026rsquo;s three major challenges: What is the appropriate size for an index shard?\n [Sharding Puzzle!] Shards are like cars that transport your data, but determining how many shards you need is challenging because the amount of data is unpredictable and could continuously grow.\n [Sizing Puzzle!] Traditional sharding methods have several shortcomings. Currently, distributed system storage partitioning methods mainly include:\n Range-based partitioning methods require data to have a high dispersion in value ranges. Fixed-factor hash partitioning methods, set at database creation, may lead to over-allocation of resources if the partition factor is too large or performance issues if it\u0026rsquo;s too small. Consistent hashing algorithms lack adaptability to heterogeneous systems and flexibility in data partitioning, resulting in complex operations and suboptimal resource utilization. Is there any other approach?\nPizza\u0026rsquo;s design # Pizza does things differently!\nStart with document ID # Pizza facilitates updates, and it\u0026rsquo;s top-notch. Ensuring efficient updates requires a unique identity for each document. While accommodating a vast dataset beyond trillions of documents, one can opt for a string-based UUID or utilize a uint64 or uint128 assigned to each document. 
However, utilizing a wide primary key may lead to resource wastage, unnecessary compression, or conversion.\nIn Pizza, document identification follows a two-dimensional approach. Each document is assigned a unique identity comprising the rolling ID (rolling_id) and the internal assigned ID (seq_doc_id) within this rolling. These IDs are structured for efficiency, incorporating partition positions for rapid data localization. Rolling IDs and assigned document IDs increment automatically. Document value ranges vary based on numeric types chosen, accommodating trillions of documents. User-defined IDs seamlessly map to unique IDs.\nWith this design:\n Assigned ID format: [rolling_id], [seq_doc_id]. Assigned IDs adopt a composite two-dimensional structure. Assigned IDs are compact numeric types designed to support massive datasets. Assigned IDs are self-descriptive and include partition positions as routing information for rapid data access. The rolling ID serves as a metadata-level description and does not require persistence with each record. The space allocated for sequence assigned IDs is quite compact and compression friendly. The assigned ID serves as a globally unique identity, which also makes it a good fit for frequent updates. Within a single collection, Pizza offers varying document value ranges to accommodate different scale requirements. For [UInt8, UInt32], rolling_id ranges from 0 to 255 (UInt8) and seq_doc_id ranges from 0 to 4,294,967,295 (UInt32), which can sustain [0 : 1,095,216,660,225] documents. 
If we scale rolling_id to use UInt16, the ranges become [0 : 65,535], [0 : 4,294,967,295] = [0 : 281,470,681,677,825], roughly 281 trillion documents, which should be a good start for any case.\nThese capabilities enable Pizza to handle collections of varying sizes, from small-scale to trillions of documents, and to scale efficiently and smoothly on demand.\nUser-defined IDs # \u0026ldquo;But my IDs were shipped from an external database\u0026rdquo;\nThat\u0026rsquo;s fair. Pizza handles this by simply mapping the UUID to a unique assigned document ID.\nFor example with this document creation:\nPOST /my-collection/_doc/myid { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } You will get:\n{ \u0026#34;_key\u0026#34;: \u0026#34;myid\u0026#34;, \u0026#34;_id\u0026#34;: \u0026#34;0,123\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } The _id value 0,123 means a unique Pizza document ID was assigned to myid within this collection.\nInstead of passing the UUID throughout the further process, it\u0026rsquo;s common to begin with a search, and you will get a search result. The document in the search result should contain both _id and _key. Either of them can be passed as the document identity.\nJust like a Pizza # As you\u0026rsquo;ve noted, the maximum number of documents in a single rolling is a fixed size of 4,294,967,295, typically suitable for smaller use cases. The fixed capacity of a single rolling is not a bug, it is a feature!\nThink of the rolling as the iron plate used for cooking the Pizza, and sending data to the rolling is akin to adding your delicious ingredients to that iron plate.\nWe ingest the data, we enjoy the pizza, just like that!\n [Yummy Pizza!] 
More Pizza - Rolling # So you have more than 4,294,967,295 documents?\nNo worries, let\u0026rsquo;s roll to another rolling, more rolling, more pizza, the party won\u0026rsquo;t stop tonight.\n [More Pizza!] When the capacity of a rolling is exceeded, data is automatically switched to the next rolling for continued writing.\nRollings can grow infinitely to meet ongoing growth requirements.\nPackaged Pizza # There are many benefits to packaging data like Pizza:\n Each rolling has a fixed size for ease of distribution and physical resource management. The number and size of shards are predictable. Shards are generated on demand, eliminating the need for advance planning. Scalability is infinite, allowing for horizontal expansion. Stable and predictable read/write performance. [Packaged Pizza!] Slicing with partition # Wait, 4.2 billion is not a small number, and a single rolling may choke anyway, so we introduce partitions within a rolling, just like we share slices of pizza with friends.\n [Slicing Pizza!] A single rolling can be split into a maximum of 256 physical partitions by default (configurable at creation). A lookup table is used to maintain the relationship between logical partitions and physical shards.\n [Slicing Pizza!] Physical shards and logical partitions can be dynamically split or merged. In scenarios with low write pressure, all partition data within a single shard is consolidated, appearing as a single data directory physically.\nHash based routing # What if we have more than one rolling? How do we know which rolling contains a given UUID?\nCustom or assigned IDs are hashed to establish a one-to-one relationship with partitions:\n[HASH(KEY) or ID] % 256 = PARTITION_ID\n [Routing Slices!] 
In the worst case, a request needs to visit the same partition across all rollings, but the scope is limited because rollings advance in steps of 4.2 billion documents, and a UUID mapping cache can also be placed in front.\nIt is always better to use the Pizza-assigned _id rather than the _key for mutations: because _rolling_id is part of _id, Pizza knows which rolling to talk to without asking.\nUltimate scaling # Lastly, let us talk about replicas: each shard can have replicas to scale out for more search throughput.\n [Sharding Architecture] Rolling, Partition, Replica - three dimensions for ultimate scaling.\nThe more data you feed in, the more pizza you cook. Enjoy your yummy data!\n","subcategory":"Sharding","summary":"","tags":["distributed","architecture","sharding"],"title":"Why named Pizza","url":"/docs/overview/sharding/"},{"category":"Overview","content":"Architecture # Share-Nothing and Asynchronous I/O in Pizza # Pizza is built upon a robust share-nothing architecture, ensuring complete isolation of resources at both the node and per-CPU level. Each CPU core and associated threads operate independently, without sharing memory or resources with other cores or nodes. Additionally, Pizza accesses I/O and network resources in a fully asynchronous manner, leveraging technologies like io_uring for efficient I/O operations.\n [Pizza Architecture] Why Share-Nothing and Asynchronous I/O? # The combination of share-nothing architecture and asynchronous I/O enables Pizza to seamlessly scale across large-scale datasets and high-throughput workloads, along with optimal performance and high resource utilization.\nMulti-Core Trending and Hardware Considerations # As hardware trends towards increasing numbers of cores per machine, share-nothing architectures become increasingly relevant and advantageous. 
With machines featuring thousands of cores becoming more common, share-nothing architectures enable efficient utilization of parallelism without encountering bottlenecks associated with shared resources.\nContention and Locking Issues # In share-everything architectures, contention for shared resources and locking mechanisms can become significant bottlenecks, especially in highly parallel environments. Share-nothing architectures eliminate these contention points by ensuring each node operates independently, avoiding the need for centralized locking mechanisms and reducing contention-related performance degradation.\nFault Isolation and Resilience # Pizza\u0026rsquo;s share-nothing architecture enhances fault isolation and system resilience. Each CPU core and thread operates autonomously, minimizing the impact of failures or performance degradation on other cores or nodes. Similarly, asynchronous I/O operations isolate I/O-related failures, ensuring that failures in one operation do not affect the execution of others.\nNUMA and Cache-Friendly Design # Pizza is designed to be NUMA-friendly and local cache or memory access-friendly. By minimizing memory access latency and optimizing performance on NUMA architectures, Pizza ensures efficient memory access and optimal performance. Additionally, its cache-friendly design eliminates the need to access remote memory in other CPU\u0026rsquo;s address spaces, reducing cache coherency overhead and improving overall performance.\nAsynchronous I/O # Pizza employs asynchronous I/O and io_uring for efficient, non-blocking I/O operations, enhancing performance and scalability. Unlike traditional synchronous I/O, which involves blocking system calls, asynchronous I/O enables Pizza to reduce latency and improve resource utilization, especially in I/O-bound scenarios. With io_uring, a high-performance asynchronous I/O framework in the Linux kernel, Pizza minimizes system call overhead and optimizes buffer management. 
This approach delivers improved performance, scalability, resource efficiency, and a better user experience.\n","subcategory":"Architecture","summary":"","tags":["architecture"],"title":"Architecture","url":"/docs/overview/architecture/"},{"category":"Getting Started","content":"Pizza CLI # The Pizza Command Line Interface (CLI) is a tool designed to facilitate quick and interactive communication with the Pizza server. It provides a convenient way for users to perform various tasks, such as querying data, managing configurations, and monitoring system status, directly from the command line.\nFeatures # Interactive Querying # The Pizza CLI allows users to execute queries against the Pizza server interactively. Users can enter commands and receive immediate feedback, enabling rapid exploration and analysis of data.\nConfiguration Management # With the Pizza CLI, users can manage Pizza server configurations effortlessly. They can adjust settings, update parameters, and modify configurations on the fly, all from the command line interface.\nSystem Monitoring # The Pizza CLI provides real-time monitoring capabilities, allowing users to track system performance, monitor resource usage, and identify potential bottlenecks or issues promptly.\nUsage # To use the Pizza CLI, simply launch the command line interface and enter the desired commands. 
The CLI provides intuitive prompts and options to guide users through various operations.\nOptions # Start with your Pizza endpoint: ./cli http://localhost:9100/\n","subcategory":"Pizza-CLI","summary":"","tags":["installation","cli"],"title":"Pizza CLI","url":"/docs/getting-started/cli/"},{"category":null,"content":"Type Parameter # analyzer: the analyzer used for indexing\n search_analyzer: the analyzer used for searching\n index_options: controls what information is added to the inverted index. Available options are:\n docs: Only the doc number is indexed. Can answer the question Does this term exist in this field? freqs: Doc number and term frequencies are indexed. Term frequencies are used to score repeated terms higher than single terms. positions (default): Doc number, term frequencies, and term positions (or order) are indexed. Positions can be used for proximity or phrase queries. offsets: Doc number, term frequencies, positions, and start and end character offsets (which map the term back to the original string) are indexed. Offsets are used by the unified highlighter to speed up highlighting. realtime: For non-object fields, enabling this option allows the field to support real-time search. For object fields, the realtime parameter overrides the realtime settings of its sub-fields.\n index: For non-object fields, if set, Pizza builds an index for the field to make it searchable. For object fields, the index parameter overrides the index settings of its sub-fields.\n fields: It is often useful to index the same field in different ways for different purposes. 
This is the purpose of multi-fields\n properties: nested sub-fields\n ","subcategory":null,"summary":"","tags":null,"title":"Type Parameter","url":"/docs/references/types/type_parameter/"},{"category":"Catalog","content":"Get index setting # Returns setting information about one or more indices under the specified collection.\nExamples # Get the setting information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_setting Request # GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_setting GET /\u0026lt;target\u0026gt;/_index/_setting Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n ","subcategory":"Index","summary":"","tags":["setting","index"],"title":"Get index setting","url":"/docs/references/index/get_settings/"},{"category":"Catalog","content":"Get index mapping # Returns mapping information about one or more indices under the specified collection.\nExamples # Get the mapping information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_mapping Request # GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_mapping GET /\u0026lt;target\u0026gt;/_index/_mapping Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n ","subcategory":"Index","summary":"","tags":["mapping","index"],"title":"Get index mapping","url":"/docs/references/index/get_mapping/"},{"category":"Catalog","content":"Get index alias # Returns alias information about one or more indices under the specified collection.\nExamples # Get the alias information of the my-index index under collection my-collection:\nGET /my-collection/_index/my-index/_alias Request # GET 
/\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt;/_alias GET /\u0026lt;target\u0026gt;/_index/_alias Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Optional, String) Comma-separated, names of the indices to get (wildcard supported)\n ","subcategory":"Alias","summary":"","tags":["get","alias"],"title":"Get index alias","url":"/docs/references/index/get_alias/"},{"category":"Catalog","content":"Get index # Returns information about one or more indices\nExamples # Get the information of all indices:\nGET /_index Get the information of the index named my-index under collection my-collection:\nGET /my-collection/_index/my-index Request # GET /_index GET /\u0026lt;target\u0026gt;/_index GET /\u0026lt;target\u0026gt;/_index/\u0026lt;index\u0026gt; Path Parameters # target\n(Required, String) Comma-separated, names of the collections to specify (wildcard supported)\n index\n(Required, String) Comma-separated, names of the indices to get (wildcard supported)\n ","subcategory":"Index","summary":"","tags":["get","index"],"title":"Get index","url":"/docs/references/index/get/"},{"category":"Catalog","content":"Get collection settings # Returns settings information about one or more collections.\nExamples # The following request gets the settings information of all the collections under the default namespace:\nGET /default:*/_settings Retrieve the settings information of all the collections:\nGET /_settings Request # GET /\u0026lt;target\u0026gt;/_settings Path Parameters # target\n(Optional, String) Comma-separated, names of the collections to get (wildcard supported) ","subcategory":"Collection","summary":"","tags":["get","settings"],"title":"Get collection settings","url":"/docs/references/collection/get_settings/"},{"category":"Catalog","content":"Get collection schema # Returns schema information about one or more collections.\nExamples # The following request gets the schema 
information of all the collections under the default namespace:\nGET /default:*/_schema Retrieve the schema information of all the collections:\nGET /_schema Request # GET /\u0026lt;target\u0026gt;/_schema Path Parameters # target\n(Optional, String) Comma-separated, names of the collections to get (wildcard supported) ","subcategory":"Collection","summary":"","tags":["get","schema"],"title":"Get collection schema","url":"/docs/references/collection/get_schema/"},{"category":"Catalog","content":"Get collection rollings # Returns rollings information about one or more collections.\nExamples # The following request gets the rollings information of all the collections under the default namespace:\nGET /default:*/_rollings Retrieve the rollings information of all the collections:\nGET /_rollings Request # GET /\u0026lt;target\u0026gt;/_rollings Path Parameters # target\n(Optional, String) Comma-separated, names of the collections to get (wildcard supported) ","subcategory":"Collection","summary":"","tags":["get","rolling"],"title":"Get collection rollings","url":"/docs/references/collection/get_rollings/"},{"category":"Catalog","content":"Get collection index # See the following documents:\n Get index Get index alias Get index mapping Get index settings ","subcategory":"Collection","summary":"","tags":["get","index"],"title":"Get collection index","url":"/docs/references/collection/get_index/"},{"category":"Catalog","content":"Get collection # Returns information about one or more collections.\nExamples # The following request gets all the collections under the default namespace:\nGET /default:* Request # GET /\u0026lt;target\u0026gt; Path Parameters # target\n(Required, String) Comma-separated, names of the collections to get (wildcard supported) ","subcategory":"Collection","summary":"","tags":["get","collection"],"title":"Get collection","url":"/docs/references/collection/get/"},{"category":"Catalog","content":"Delete a namespace # Delete an existing 
namespace.\nExamples # The following request delete the namespace called website:\nDELETE /_namespace/website Request # DELETE /_namespace/\u0026lt;name\u0026gt; Path parameters # \u0026lt;name\u0026gt;\n(Optional, string) The name of the namespace that you want to delete. ","subcategory":"Namespace","summary":"","tags":["delete","namespace"],"title":"Delete a namespace","url":"/docs/references/namespace/delete/"},{"category":"Catalog","content":"Delete a collection # Delete a exists collection.\nExamples # The following request deletes the collection called my-collection:\nDELETE my-collection Request # PUT /[\u0026lt;namespace\u0026gt;:]\u0026lt;name\u0026gt; Path Parameters # \u0026lt;namespace\u0026gt;\n(Optional, string) The namespace which the collection belongs to. \u0026lt;name\u0026gt;\n(Required, string) Name of the collection you wish to create. ","subcategory":"Collection","summary":"","tags":["delete","collection"],"title":"Delete a collection","url":"/docs/references/collection/delete/"},{"category":"Getting Started","content":"Configuration # Pizza supports several methods to overwrite the default configuration.\nCommand lines # ➜ ./bin/pizza --help A Distributed Real-Time Search \u0026amp; AI-Native Innovation Engine. 
Usage: pizza [OPTIONS] [COMMAND]\nCommands: service Builtin service management (install, uninstall, start, stop) help Print this message or the help of the given subcommand(s)\nOptions: -l, \u0026ndash;log \u0026lt;LEVEL\u0026gt; Set the logging level, options: trace,debug,info,warn,error \u0026ndash;debug Run in debug mode, panic immediately with full stack trace -c, \u0026ndash;config \u0026lt;FILE\u0026gt; -p, \u0026ndash;pid \u0026lt;FILE\u0026gt; Place pid to this file -E, \u0026ndash;override \u0026lt;KEY=VALUE\u0026gt;\n-h, \u0026ndash;help Print help -V, \u0026ndash;version Print version Configuration file #\n You can fully customize Pizza by utilizing the pizza.yaml configuration file:\n# ======================== INFINI Pizza Configuration ========================== # \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; Log \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; log: level: info\n# \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; API \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; gateway: network: binding: 127.0.0.1:9100 skip_occupied_port: true\n# \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; Cluster \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;- cluster: name: pizza\nnode: name: my_node_1 network: binding: 127.0.0.1:8100 skip_occupied_port: true # 
\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; Storage \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;- storage: compression: ZSTD\n# \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026ndash; MemTable \u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash;\u0026mdash; memtable: threshold: 1k\nmax_num_of_instance: 2 allow_multi_instance: true Override configuration #\n You can tweak the configuration by passing the command line option -E with KEY=VALUE style during Pizza start:\n./bin/pizza -E log.level=trace -E gateway.network.binding=127.0.0.1:12200 ","subcategory":"Configuration","summary":"","tags":["tips","configuration files"],"title":"Configuration","url":"/docs/getting-started/configuration/"},{"category":"Observability","content":"Cluster state # Returns an internal representation of the cluster state for debugging or diagnostic purposes.\nGet the whole cluster state # Requests # GET /_cluster/state/\u0026lt;names\u0026gt; Path Parameters # names\n(Optional, string) A comma-separated list of the following options:\n _all\nShows all names. blocks\nShows the blocks part of the response. leader_node\nShows the leader_node part of the response. metadata\nShows the metadata part of the response. nodes\nShows the nodes part of the response. routing_nodes\nShows the routing_nodes part of the response. routing_table\nShows the routing_table part of the response. version\nShows the cluster state version. Get the state of a specific region # Requests # GET /_cluster/_region/\u0026lt;region_id\u0026gt;/state/\u0026lt;names\u0026gt; Path parameters # region_id\n(Required, String) The UUID of the region you want to query. 
A special ID _local can be specified to query the state of the region that handles this request.\n names\n(Optional, string) A comma-separated list of options, see the names parameter of the cluster state API for the full list of options.\n ","subcategory":"Cluster","summary":"","tags":["cluster","state"],"title":"Cluster state","url":"/docs/administration/observability/state/"},{"category":"Aggregation","content":"Value count aggregation # A single-value metrics aggregation that counts the number of values that are extracted from the aggregated documents. Typically, this aggregator will be used in conjunction with other single-value aggregations. For example, when computing the avg one might be interested in the number of values the average is computed over.\nvalue_count does not de-duplicate values, so even if a field has duplicates each value will be counted individually.\nExamples # Assuming the data consists of documents representing sales records, we can count the values of the type field with:\nPOST /sales/_search { \u0026#34;aggs\u0026#34; : { \u0026#34;types_count\u0026#34; : { \u0026#34;value_count\u0026#34; : { \u0026#34;field\u0026#34; : \u0026#34;type\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;types_count\u0026#34;: { \u0026#34;value\u0026#34;: 7 } } } The name of the aggregation (types_count above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for value_count # field\n(Required, string) Field you wish to aggregate. 
","subcategory":"Metric","summary":"","tags":["value_count","aggregation"],"title":"Value count aggregation","url":"/docs/references/aggregation/value-count/"},{"category":"Aggregation","content":"Terms aggregation # A multi-bucket value source based aggregation where buckets are dynamically built - one per unique value.\nExamples # POST /_search { \u0026#34;aggs\u0026#34;: { \u0026#34;genres\u0026#34;: { \u0026#34;terms\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;genre\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;genres\u0026#34;: { \u0026#34;doc_count_error_upper_bound\u0026#34;: 0, \u0026#34;sum_other_doc_count\u0026#34;: 0, \u0026#34;buckets\u0026#34;: [ { \u0026#34;key\u0026#34;: \u0026#34;electronic\u0026#34;, \u0026#34;doc_count\u0026#34;: 6 }, { \u0026#34;key\u0026#34;: \u0026#34;rock\u0026#34;, \u0026#34;doc_count\u0026#34;: 3 }, { \u0026#34;key\u0026#34;: \u0026#34;jazz\u0026#34;, \u0026#34;doc_count\u0026#34;: 2 } ] } } } Parameters for terms # field\n(Required, string) Field you wish to aggregate. ","subcategory":"Bucket","summary":"","tags":["terms","aggregation"],"title":"Terms aggregation","url":"/docs/references/aggregation/terms/"},{"category":"Search","content":"Term query # Returns documents that contain an exact term in a provided field.\nYou can use the term query to find documents based on a precise value such as a price, a product ID, or a username.\nExamples # GET /_search { \u0026#34;query\u0026#34;: { \u0026#34;term\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;infini\u0026#34; } } } } Top-level parameters for term # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Term you wish to find in the provided \u0026lt;field\u0026gt;. To return a document, the term must exactly match the field value, including whitespace and capitalization. 
case_insensitive\n(Optional, Boolean) Allows ASCII case insensitive matching of the value with the indexed field values when set to true. Default is false. ","subcategory":"Query","summary":"","tags":["term","query"],"title":"Term query","url":"/docs/references/search/term/"},{"category":"Aggregation","content":"Sum aggregation # A single-value metrics aggregation that sums up numeric values that are extracted from the aggregated documents.\nExamples # Assuming the data consists of documents representing sales records, we can sum the sale price of all hats with:\nPOST /sales/_search { \u0026#34;query\u0026#34;: { \u0026#34;constant_score\u0026#34;: { \u0026#34;filter\u0026#34;: { \u0026#34;match\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;hat\u0026#34; } } } }, \u0026#34;aggs\u0026#34;: { \u0026#34;hat_prices\u0026#34;: { \u0026#34;sum\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Resulting in:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;hat_prices\u0026#34;: { \u0026#34;value\u0026#34;: 450.0 } } } The name of the aggregation (hat_prices above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for sum # field\n(Required, string) Field you wish to aggregate. ","subcategory":"Metric","summary":"","tags":["sum","aggregation"],"title":"Sum aggregation","url":"/docs/references/aggregation/sum/"},{"category":"Cluster","content":"Get Region Settings # Returns the settings configured for the region.\nRequests # GET /_cluster/_region/\u0026lt;region_id\u0026gt;/settings Path Parameters # region_id\n(Required, String) The UUID of the region you want to query. A special ID _local can be specified to query the state of the region that handles this request. Query Parameters # include_defaults\n(Optional, Boolean) If true, returns all default region settings. Defaults to false.\n flat_settings\n(Optional, Boolean) If true, returns settings in flat format. 
Defaults to false.\n Update Region Settings # Updates the settings configured for the region.\nRequests # PUT /_cluster/_region/\u0026lt;region_id\u0026gt;/settings Path Parameters # region_id\n(Required, String) The UUID of the region you want to update. A special ID _local can be specified to target the region that handles this request. Query Parameters # flat_settings\n(Optional, Boolean) If true, returns settings in flat format. Defaults to false. Request body # The region settings you want to update:\n{ \u0026#34;persistent\u0026#34;: { ... }, \u0026#34;transient\u0026#34;: { ... } } ","subcategory":"Region","summary":"","tags":["metadata"],"title":"Region Settings","url":"/docs/references/configuration/region_settings/"},{"category":"Search","content":"Regexp query # Returns documents that contain terms matching a regular expression.\nA regular expression is a way to match patterns in data using placeholder characters, called operators. For a list of operators supported by the regexp query, see Regular expression syntax.\nExamples # The following search returns documents where the org.id field contains any term that begins with in and ends with i. The .* operators match any characters of any length, including no characters. Matching terms can include ini, inni, and infini.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;regexp\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;in.*i\u0026#34;, \u0026#34;case_insensitive\u0026#34;: true } } } } Top-level parameters for regexp # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Regular expression for terms you wish to find in the provided \u0026lt;field\u0026gt;. For a list of supported operators, see Regular expression syntax.\ncase_insensitive\n(Optional, Boolean) Allows ASCII case insensitive matching of the value with the indexed field values when set to true. 
Default is false. ","subcategory":"Query","summary":"","tags":["regexp","query"],"title":"Regexp query","url":"/docs/references/search/regexp/"},{"category":"Search","content":"Range query # Returns documents that contain terms within a provided range.\nExamples # The following search returns documents where the age field contains a term between 10 and 20.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;range\u0026#34;: { \u0026#34;age\u0026#34;: { \u0026#34;gte\u0026#34;: 10, \u0026#34;lte\u0026#34;: 20 } } } } Top-level parameters for range # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # gt\n(Optional) Greater than. gte\n(Optional) Greater than or equal to. lt\n(Optional) Less than. lte\n(Optional) Less than or equal to. ","subcategory":"Query","summary":"","tags":["range","query"],"title":"Range query","url":"/docs/references/search/range/"},{"category":"Search","content":"Prefix query # Returns documents that contain a specific prefix in a provided field.\nExamples # The following search returns documents where the org.id field contains a term that begins with inf.\nGET /_search { \u0026#34;query\u0026#34;: { \u0026#34;prefix\u0026#34;: { \u0026#34;org.id\u0026#34;: { \u0026#34;value\u0026#34;: \u0026#34;inf\u0026#34; } } } } Top-level parameters for prefix # \u0026lt;field\u0026gt;\n(Required, object) Field you wish to search. Parameters for \u0026lt;field\u0026gt; # value\n(Required, string) Beginning characters of terms you wish to find in the provided \u0026lt;field\u0026gt;. case_insensitive\n(Optional, Boolean) Allows ASCII case insensitive matching of the value with the indexed field values when set to true. Default is false. 
","subcategory":"Query","summary":"","tags":["prefix","query"],"title":"Prefix query","url":"/docs/references/search/prefix/"},{"category":"Aggregation","content":"Percentiles aggregation # A multi-value metrics aggregation that calculates one or more percentiles over numeric values extracted from the aggregated documents.\nPercentiles show the point at which a certain percentage of observed values occur. For example, the 95th percentile is the value which is greater than 95% of the observed values.\nPercentiles are often used to find outliers. In normal distributions, the 0.13th and 99.87th percentiles represents three standard deviations from the mean. Any data which falls outside three standard deviations is often considered an anomaly.\nWhen a range of percentiles are retrieved, they can be used to estimate the data distribution and determine if the data is skewed, bimodal, etc.\nExamples # Assume your data consists of website load times. The average and median load times are not overly useful to an administrator. The max may be interesting, but it can be easily skewed by a single slow response.\nLet\u0026rsquo;s look at a range of percentiles representing load time:\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34; } } } } By default, the percentile metric will generate a range of percentiles: [1, 5, 25, 50, 75, 95, 99]. The response will look like this:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;values\u0026#34;: { \u0026#34;1.0\u0026#34;: 10.0, \u0026#34;5.0\u0026#34;: 30.0, \u0026#34;25.0\u0026#34;: 170.0, \u0026#34;50.0\u0026#34;: 445.0, \u0026#34;75.0\u0026#34;: 720.0, \u0026#34;95.0\u0026#34;: 940.0, \u0026#34;99.0\u0026#34;: 980.0 } } } } As you can see, the aggregation will return a calculated value for each percentile in the default range. 
If we assume response times are in milliseconds, it is immediately obvious that the webpage normally loads in 10-725ms, but occasionally spikes to 945-985ms.\nOften, administrators are only interested in outliers — the extreme percentiles. We can specify just the percents we are interested in (requested percentiles must be a value between 0-100 inclusive):\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34;, \u0026#34;percents\u0026#34;: [95, 99, 99.9] } } } } Parameters for percentiles # field\n(Required, string) Field you wish to aggregate. percents\n(Optional, array) A range of percentiles that are calculated. Default is [1, 5, 25, 50, 75, 95, 99]. keyed # By default the keyed flag is set to true which associates a unique string key with each bucket and returns the ranges as a hash rather than an array. Setting the keyed flag to false will disable this behavior:\nPOST latency/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;percentiles\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;load_time\u0026#34;, \u0026#34;keyed\u0026#34;: false } } } } Response:\n{ ... 
\u0026#34;aggregations\u0026#34;: { \u0026#34;load_time_outlier\u0026#34;: { \u0026#34;values\u0026#34;: [ { \u0026#34;key\u0026#34;: 1.0, \u0026#34;value\u0026#34;: 10.0 }, { \u0026#34;key\u0026#34;: 5.0, \u0026#34;value\u0026#34;: 30.0 }, { \u0026#34;key\u0026#34;: 25.0, \u0026#34;value\u0026#34;: 170.0 }, { \u0026#34;key\u0026#34;: 50.0, \u0026#34;value\u0026#34;: 445.0 }, { \u0026#34;key\u0026#34;: 75.0, \u0026#34;value\u0026#34;: 720.0 }, { \u0026#34;key\u0026#34;: 95.0, \u0026#34;value\u0026#34;: 940.0 }, { \u0026#34;key\u0026#34;: 99.0, \u0026#34;value\u0026#34;: 980.0 } ] } } } ","subcategory":"Bucket","summary":"","tags":["percentile","aggregation"],"title":"Percentiles aggregation","url":"/docs/references/aggregation/percentiles/"},{"category":"Aggregation","content":"Min aggregation # A single-value metrics aggregation that keeps track of and returns the minimum value among numeric values extracted from the aggregated documents.\nExamples # Computing the min price value across all documents:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;min_price\u0026#34;: { \u0026#34;min\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;min_price\u0026#34;: { \u0026#34;value\u0026#34;: 10.0 } } } As can be seen, the name of the aggregation (min_price above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for min # field\n(Required, string) Field you wish to aggregate. 
","subcategory":"Metric","summary":"","tags":["min","aggregation"],"title":"Min aggregation","url":"/docs/references/aggregation/min/"},{"category":"Aggregation","content":"Max aggregation # A single-value metrics aggregation that keeps track of and returns the maximum value among the numeric values extracted from the aggregated documents.\nExamples # Computing the max price value across all documents:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;max_price\u0026#34;: { \u0026#34;max\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;price\u0026#34; } } } } Response:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;max_price\u0026#34;: { \u0026#34;value\u0026#34;: 200.0 } } } As can be seen, the name of the aggregation (max_price above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for max # field\n(Required, string) Field you wish to aggregate. ","subcategory":"Metric","summary":"","tags":["max","aggregation"],"title":"Max aggregation","url":"/docs/references/aggregation/max/"},{"category":"Getting Started","content":"Installation # Pizza is compatible with all major operating systems. 
The package is compiled statically, and it does not require any external dependencies.\nAutomatic installation # Use the following command to automatically download the latest version of INFINI Pizza for your platform and extract it into /opt/pizza:\ncurl -sSL http://get.infini.cloud | bash -s -- -p pizza The optional parameters for the script are as follows:\n -v \u0026lt;version number\u0026gt; (default is the latest version) -d \u0026lt;installation directory\u0026gt; (default is /opt/pizza) Manual installation # Visit the URL below to download the package for your operating system:\nhttps://release.infinilabs.com/\nVerification of the installation # Assuming Pizza is in your $PATH after installation, run the following command to ensure the package has been installed correctly:\n$ pizza --version PIZZA 0.1.0 Starting the server # Start Pizza as follows with the configuration:\n$ pizza --config pizza.yaml ___ _____ __________ _ / _ \\\\_ \\/ _ / _ / /_\\ / /_)/ / /\\/\\// /\\// / //_\\\\ / ___/\\/ /_ / //\\/ //\\/ _ \\ \\/ \\____/ /____/____/\\_/ \\_/ [PIZZA] The Next-Gen Real-Time Hybrid Search \u0026amp; AI-Native Innovation Engine. \u0026hellip; Interaction with the server #\n Assuming Pizza is listening on 127.0.0.1:9200, use the following command to create a collection named testing:\ncurl -XPUT http://127.0.0.1:9200/testing Refer to the reference page for more APIs.\nShutdown the server # Press Ctrl+C to shut down Pizza, and the message below is displayed:\n... __ _ __ ____ __ _ __ __ / // |/ // __// // |/ // / / // || // _/ / // || // / /_//_/|_//_/ /_//_/|_//_/ ©INFINI.LTD, All Rights Reserved. 
\n","subcategory":"Pizza-server","summary":"","tags":["installation","pizza-server"],"title":"Installation","url":"/docs/getting-started/installation/"},{"category":"Catalog","content":"Delete an index # Deletes an existing index under a collection.\nExamples # The following request deletes the index called my-index under collection my-namespace:my-collection\nDELETE /my-namespace:my-collection/_index/my-index Request # DELETE /\u0026lt;target\u0026gt;/_index/\u0026lt;name\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) The collection which the index will be removed from.\n \u0026lt;name\u0026gt;\n(Required, string) Name of the index you wish to delete.\n ","subcategory":"Index","summary":"","tags":["delete","index"],"title":"Delete an index","url":"/docs/references/index/delete/"},{"category":"Aggregation","content":"Date histogram aggregation # This multi-bucket aggregation is similar to the normal histogram, but it can only be used with date or date range values. Because dates are represented internally in Elasticsearch as long values, it is possible, but not as accurate, to use the normal histogram on dates as well. The main difference in the two APIs is that here the interval can be specified using date/time expressions. Time-based data requires special support because time-based intervals are not always a fixed length.\nExamples # As an example, here is an aggregation requesting bucket intervals of a month in calendar time:\nPOST /sales/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;sales_over_time\u0026#34;: { \u0026#34;date_histogram\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;date\u0026#34;, \u0026#34;calendar_interval\u0026#34;: \u0026#34;1M\u0026#34; } } } } Response:\n{ ... 
\u0026#34;aggregations\u0026#34;: { \u0026#34;sales_over_time\u0026#34;: { \u0026#34;buckets\u0026#34;: [ { \u0026#34;key\u0026#34;: 1420070400000, \u0026#34;doc_count\u0026#34;: 3 }, { \u0026#34;key\u0026#34;: 1422748800000, \u0026#34;doc_count\u0026#34;: 2 }, { \u0026#34;key\u0026#34;: 1425168000000, \u0026#34;doc_count\u0026#34;: 2 } ] } } } Parameters for date_histogram # field\n(Required, string) Field you wish to aggregate. calendar_interval # (Optional, string) Calendar-aware intervals are configured with the calendar_interval parameter. You can specify calendar intervals using the unit name, such as month, or as a single unit quantity, such as 1M. For example, day and 1d are equivalent. Multiple quantities, such as 2d, are not supported.\nThe accepted calendar intervals are:\n minute, 1m\nAll minutes begin at 00 seconds. One minute is the interval between 00 seconds of the first minute and 00 seconds of the following minute in the specified time zone, compensating for any intervening leap seconds, so that the number of minutes and seconds past the hour is the same at the start and end. hour, 1h\nAll hours begin at 00 minutes and 00 seconds. One hour (1h) is the interval between 00:00 minutes of the first hour and 00:00 minutes of the following hour in the specified time zone, compensating for any intervening leap seconds, so that the number of minutes and seconds past the hour is the same at the start and end. day, 1d\nAll days begin at the earliest possible time, which is usually 00:00:00 (midnight). One day (1d) is the interval between the start of the day and the start of the following day in the specified time zone, compensating for any intervening time changes. week, 1w\nOne week is the interval between the start day_of_week:hour:minute:second and the same day of the week and time of the following week in the specified time zone. 
month, 1M\nOne month is the interval between the start day of the month and time of day and the same day of the month and time of the following month in the specified time zone, so that the day of the month and time of day are the same at the start and end. quarter, 1q\nOne quarter is the interval between the start day of the month and time of day and the same day of the month and time of day three months later, so that the day of the month and time of day are the same at the start and end. year, 1y\nOne year is the interval between the start day of the month and time of day and the same day of the month and time of day the following year in the specified time zone, so that the date and time are the same at the start and end. fixed_interval # Fixed intervals are configured with the fixed_interval parameter.\nIn contrast to calendar-aware intervals, fixed intervals are a fixed number of SI units and never deviate, regardless of where they fall on the calendar. One second is always composed of 1000ms. This allows fixed intervals to be specified in any multiple of the supported units.\nHowever, it means fixed intervals cannot express other units such as months, since the duration of a month is not a fixed quantity. Attempting to specify a calendar interval like month or quarter will throw an exception.\nThe accepted units for fixed intervals are:\n milliseconds (ms)\nA single millisecond. This is a very, very small interval. seconds (s)\nDefined as 1000 milliseconds each. minutes (m)\nDefined as 60 seconds each (60,000 milliseconds). All minutes begin at 00 seconds. hours (h)\nDefined as 60 minutes each (3,600,000 milliseconds). All hours begin at 00 minutes and 00 seconds. days (d)\nDefined as 24 hours (86,400,000 milliseconds). All days begin at the earliest possible time, which is usually 00:00:00 (midnight). 
","subcategory":"Bucket","summary":"","tags":["date","histogram","aggregation"],"title":"Date histogram aggregation","url":"/docs/references/aggregation/date-histogram/"},{"category":"Catalog","content":"Create an index # Creates a new index under a collection.\nExamples # The following request creates a new index called my-index under collection my-namespace:my-collection\nPUT /my-namespace:my-collection/_index/my-index Request # PUT /\u0026lt;target\u0026gt;/_index/\u0026lt;name\u0026gt; Path parameters # \u0026lt;target\u0026gt;\n(Required, string) The collection which the index belongs to.\n \u0026lt;name\u0026gt;\n(Required, string) Name of the index you wish to create.\n ","subcategory":"Index","summary":"","tags":["create","index"],"title":"Create an index","url":"/docs/references/index/create/"},{"category":"Catalog","content":"Create a namespace # Creates a new namespace.\nExamples # If creating a website namespace, the following request creates a new namespace called website:\nPUT /_namespace/website Request # PUT /_namespace/\u0026lt;name\u0026gt; Path parameters # \u0026lt;name\u0026gt;\n(Required, string) The name of the namespace. Namespace names must meet the following criteria: Lowercase only Cannot include \\ /, *, ?, \u0026quot;, \u0026lt;, \u0026gt;, |, , ,, # Cannot start with -, _, + Cannot be . or .. 
Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster) ","subcategory":"Namespace","summary":"","tags":["create","namespace"],"title":"Create a namespace","url":"/docs/references/namespace/create/"},{"category":"Document","content":"Create a document # Creates a new document.\nExamples # Insert a JSON document into the my-collection collection:\nPOST /my-collection/_doc { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } The API returns the following result:\n{ \u0026#34;_id\u0026#34;: \u0026#34;0,0\u0026#34;, \u0026#34;_version\u0026#34;: 1, \u0026#34;_namespace\u0026#34;: \u0026#34;default\u0026#34;, \u0026#34;_collection\u0026#34;: \u0026#34;my-collection\u0026#34;, \u0026#34;result\u0026#34;: \u0026#34;created\u0026#34;, ... } The API also supports passing a custom ID as the document identifier, e.g.:\nPOST /my-collection/_doc/news_001 { \u0026#34;message\u0026#34;: \u0026#34;GET /search HTTP/1.1 200 1070000\u0026#34;, \u0026#34;org\u0026#34;: { \u0026#34;id\u0026#34;: \u0026#34;infini\u0026#34; } } Request # POST /\u0026lt;target\u0026gt;/_doc/[\u0026lt;doc_id\u0026gt;] {\u0026lt;fields\u0026gt;} Path parameters # \u0026lt;target\u0026gt;\n(Required, string) Name of the collection to target. \u0026lt;doc_id\u0026gt;\n(Optional, string) The unique identifier of the document; auto-generated if not specified. Request body # \u0026lt;fields\u0026gt;\n(Required, string) Request body contains the JSON source for the document data. 
","subcategory":"Index","summary":"","tags":["create","index"],"title":"Create a document","url":"/docs/references/document/create/"},{"category":"Catalog","content":"Create a collection # Creates a new collection.\nExamples # The following request creates a new collection called my-collection in the namespace my-namespace:\nPUT /my-namespace:my-collection If creating a collection within the default namespace, it can be simplified as:\nPUT /my-collection Request # PUT /[\u0026lt;namespace\u0026gt;:]\u0026lt;name\u0026gt; Path parameters # \u0026lt;namespace\u0026gt;\n(Optional, string) The namespace which the collection belongs to. Namespace names must meet the following criteria: Lowercase only Cannot include \\ /, *, ?, \u0026quot;, \u0026lt;, \u0026gt;, |, , ,, # Cannot start with -, _, + Cannot be . or .. Cannot be longer than 255 bytes (note it is bytes, so multi-byte characters will count towards the 255 limit faster) \u0026lt;name\u0026gt;\n(Required, string) Name of the collection you wish to create. Collection names must meet the same criteria as namespace names. Query parameters # \u0026lt;wait_for_active_shards\u0026gt; \\\n(Optional, string) The number of copies of each shard that must be active before proceeding with the operation. 
Set to all or any non-negative integer up to the total number of copies of each shard in the collection (number_of_replicas+1).\nDefaults to 1, meaning to wait just for each primary shard to be active.\n \u0026lt;timeout\u0026gt;\n(Optional, time units) Period the request waits for the following operations:\n Waiting for active shards Defaults to 1m (one minute).\n Request body # settings\nCollection settings\n rolling.partitions_sharding_strategy\n(Optional, object) Specifies the default value for rolling setting partitions_sharding_strategy, which is used to configure the initial number of primary shards and how partitions are assigned to them for a rolling.\nDefault value: create 1 shard and assign all partitions to it.\nSupported strategies are listed below:\n Hash\n\u0026#34;hash\u0026#34;: { \u0026#34;number_of_shards\u0026#34;: \u0026lt;num_shards\u0026gt; } One can specify the total number of shards, partitions are assigned to shards in this way:\nshard_index = hash(partition_id) mod number_of_shards Range\n\u0026#34;range\u0026#34;: [\u0026#34;0..128, 129..255\u0026#34;] An array of partition range needs to be provided, every array item represents a shard, the partitions specified in it will be assigned to the shard.\nThe above example evenly assigns 256 partitions to 2 shards.\n rolling.number_of_replicas\n(Optional, Integer) Specify the default value for rolling setting number_of_replicas, which controls the number of replicas a primary shard will have. Defaults to 1.\n analysis.analyzer.default.type\n(Optional, string) Collection-level default index analyzer, for text fields only.\n analysis.analyzer.default_search.type\n(Optional, string) Collection-level default search analyzer, for text fields only.\n schema\nInitial collection schema\n dynamic\n(Optional, Dynamic) Controls whether new fields are added dynamically.\nAvailable options:\n true: New fields are added to the schema (default). 
false: These fields will not be indexed or searchable, but will still appear in the _source field of returned hits. These fields will not be added to the schema, and new fields must be added explicitly. strict: If new fields are detected, an exception is thrown and the document is rejected. New fields must be explicitly added to the schema. properties\n(Optional, object) Initial fields properties.\nDefault value: no properties will be set.\nExample value:\n\u0026#34;properties\u0026#34;: { \u0026#34;id\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;integer\u0026#34; }, \u0026#34;history\u0026#34;: { \u0026#34;type\u0026#34;: \u0026#34;text\u0026#34;, \u0026#34;analyzer\u0026#34;: \u0026#34;standard\u0026#34;, \u0026#34;search_analyzer\u0026#34;: \u0026#34;whitespace\u0026#34; } } For more information about the types supported by Pizza, please refer to Types.\n ","subcategory":"Collection","summary":"","tags":["create","collection"],"title":"Create a collection","url":"/docs/references/collection/create/"},{"category":"Overview","content":"Concepts # Pizza is a distributed search engine designed to efficiently index and retrieve documents across large-scale datasets. It organizes data in a hierarchical structure, allowing for flexible management and retrieval capabilities.\nBefore you start using Pizza, familiarize yourself with the following key concepts:\n [Pizza Concepts] Concepts # Cluster # A cluster represents a set of interconnected nodes that collectively form the Pizza search engine. Nodes within a cluster collaborate to store and process data efficiently. Clusters can span multiple physical locations for fault tolerance and scalability.\nZone # A zone is a logical grouping of nodes within a cluster. Zones are typically organized based on geographic proximity or network topology. 
They facilitate data replication and fault tolerance strategies by ensuring redundancy across different zones.\nRegion # A region is a further subdivision within a zone, typically representing a smaller geographical area or a distinct network segment. Regions help optimize data access and reduce latency by distributing data closer to users or applications.\nNamespace # Pizza supports multi-tenancy by design. A namespace is a logical container for collections of related data, providing isolation and organization. Namespaces can be used to group data according to different criteria such as application domain, user, or data type.\nCollection # A collection is a grouping of documents with similar characteristics or attributes. Collections represent the primary unit of storage and retrieval within Pizza. Each collection is vertically partitioned into \u0026ldquo;rollings\u0026rdquo; to efficiently manage large datasets.\nRolling # A rolling is a vertical partition of the entire collection dataset. Each rolling contains a subset of documents, up to a maximum of 4.2 billion documents per rolling. Documents within a rolling are assigned an auto-incrementing sequential document ID, which is a uint32. Once a rolling is full, the next rolling is automatically allocated, so a collection can keep growing without a fixed document limit.\nPartition # A partition is a logical subdivision of the data within a single rolling. The number of partitions is fixed at 256 per rolling. Partitions enable horizontal scalability and performance optimization by distributing data across multiple shards. Partitions are dynamically mapped to shards and can be scaled out or merged for better search performance.\nShard # A shard is a physical container for partitions within a single rolling of a collection. Each rolling can have a different setup of shards, allowing for customized scalability and performance optimization.
Shards contain partitions within a single rolling, enabling efficient data distribution and retrieval.\nDocument # A document represents a unit of data indexed by the Pizza search engine. Documents can be of various types, such as text, images, or structured data. Each document contains fields that store specific attributes or properties, making it searchable and retrievable.\nField # A field is a specific attribute or property of a document. Fields contain the actual data that is indexed and searched within documents. Examples of fields include title, content, author, and date.\nStore # In Pizza, the \u0026ldquo;Store\u0026rdquo; refers to the primary storage for documents, also known as forward records. By default, it utilizes Parquet for storage, with the option to integrate other external storage types in the future.\nIndex # An index is a data structure used to efficiently retrieve documents based on search queries. It maps terms or keywords to the documents containing those terms, enabling fast lookup and retrieval. Indices are built and maintained based on the fields within documents.\nRelationships # Cluster to Zone/Region # A cluster consists of one or more zones, which may further contain multiple regions. Zones and regions facilitate data replication and fault tolerance strategies within the cluster.\nNamespace to Collection # Namespaces contain one or more collections, providing a logical grouping for related data. Collections within the same namespace share common management and access policies.\nCollection to Rolling # As the data within a collection grows, it is vertically partitioned into rollings to manage large datasets efficiently. Rollings represent vertical partitions of a collection\u0026rsquo;s dataset, allowing dynamic scaling and efficient querying of subsets of data.\nRolling to Partition # Each rolling is horizontally partitioned into 256 partitions, each containing a subset of documents.
Partitions enable horizontal scalability and performance optimization within a rolling.\nPartition to Shard # Partitions are dynamically mapped to shards within a collection. Shards can scale out to multiple shards or merge back into a single shard for improved search performance and resource utilization.\nDocument to Field # Documents consist of fields that store specific attributes or properties. Fields enable structured indexing and searching of documents based on their content.\nStore to Index # The store persists documents and associated metadata, while indices facilitate efficient retrieval of documents based on search queries. Stores and indices work together to provide fast and reliable data access within the Pizza search engine.\n","subcategory":"Concepts","summary":"","tags":["concept"],"title":"Concepts","url":"/docs/overview/concept/"},{"category":"Observability","content":"Cluster health # Returns the cluster health for a quick overview.\nGet the whole cluster health # Requests # GET /_cluster/health Query Parameters # level\n(Optional, string) Can be one of cluster, regions, collections, rollings, or shards. Controls the level of detail of the health information returned. Defaults to cluster. Get the health of a specific region # Requests # GET /_cluster/_region/\u0026lt;region_id\u0026gt;/health Path parameters # region_id\n(Required, String) The UUID of the region you want to query. The special ID _local can be specified to query the state of the region that handles this request. Query Parameters # level\n(Optional, string) Can be one of regions, collections, rollings, or shards. Controls the level of detail of the health information returned. Defaults to regions.
","subcategory":"Cluster","summary":"","tags":["cluster","health"],"title":"Cluster health","url":"/docs/administration/observability/health/"},{"category":"Aggregation","content":"Avg aggregation # A single-value metrics aggregation that computes the average of numeric values that are extracted from the aggregated documents.\nExamples # Assuming the data consists of documents representing exams grades (between 0 and 100) of students we can average their scores with:\nPOST /exams/_search { \u0026#34;aggs\u0026#34;: { \u0026#34;avg_grade\u0026#34;: { \u0026#34;avg\u0026#34;: { \u0026#34;field\u0026#34;: \u0026#34;grade\u0026#34; } } } } The above aggregation computes the average grade over all documents. The aggregation type is avg and the field setting defines the numeric field of the documents the average will be computed on. The above will return the following:\n{ ... \u0026#34;aggregations\u0026#34;: { \u0026#34;avg_grade\u0026#34;: { \u0026#34;value\u0026#34;: 75.0 } } } The name of the aggregation (avg_grade above) also serves as the key by which the aggregation result can be retrieved from the returned response.\nParameters for avg # field\n(Required, string) Field you wish to aggregate. ","subcategory":"Metric","summary":"","tags":["avg","aggregation"],"title":"Avg aggregation","url":"/docs/references/aggregation/avg/"}]