Cipr: Cosmic Index of Public Resources
The Cosmic[1] Index of Public Resources, or Cipr[2], is a decentralized, distributed, independent, public, universal, dynamic and queryable directory of websites and other reachable-by-DNS-resolution resources on the Internet.
The Cipr shares some features with conventional search engines, web directories and webrings. However, because of its decentralized and user-controlled nature, adding entries to the Cipr does not require crawling the web or the approval of curators or editors.
This idea is really simple. It’s surprising that something like this hasn’t been the standard resource indexing system for the World Wide Web since its inception.
With the Cipr, every content publisher owns their entries in the index, meaning, they can include, update or exclude them at will. It is the publisher ―the domain name owner― who decides when and how their resource is indexed or not.
The factors that determine the ranking position of search results cannot be obscured in the Cipr: they are standardized, public and auditable.
The equivalent of SEO[3] activity in the Cipr is very basic: a publisher only needs to provide accurate information about their resources, kept consistent over time (title, description, keywords, primary language and localization data), to make them visible to their target audience. Nothing else is required.
Censoring, banning, blocking or filtering a Cipr indexed resource is only possible through DNS censoring, banning, blocking or filtering.
The worldwide availability of any inclusion, update or exclusion to the Cipr is expected to take only a few minutes.
Having a website or any other Internet resource effectively indexed in the Cipr is a matter of:
- Owning a domain name, e.g. example.com.
- Deploying a simple daemon on the Internet: a ciprnode.
- Authorizing your ciprnode to add a couple of records to your DNS zone (or doing it manually).
Technical overview
In this specification, a resource refers to whatever a zone apex[4] points to, as well as any subdomain beneath it. Any resource that is effectively indexed in the Cipr is referred to as a cipred resource and is identified by its Zone Apex, or za, which always has the form sldl.tldl (Second Level Domain . Top Level Domain).
The Cipr is built upon a set of software components, network elements, protocols, services, policies, and constraints that ensure its completeness, integrity, availability, responsiveness, accuracy, reliability, and up-to-dateness. These components are:
- Domain Name System: The Internet’s existing naming system.
- Ciprnodes: A type of daemon whose swarm enables the existence of the Cipr.
A ciprnode is composed of:
1. Ciprdup: Queryable copy of the Cipr in every ciprnode.
2. Resindex: Queryable index of the cipred resource in every ciprnode.
3. CiprAPI: The API exposed in every ciprnode for syncing and searching tasks.
4. Ciprpulse: The automated set of ciprnode-syncing tasks.
5. Ciprface: Web interface to search the Cipr and the existing resindexes.
Domain Name System
The DNS is the old, trusted, ubiquitous hierarchical and decentralized naming system used to identify resources on the Internet; the Cipr uses it to:
1. Verify the existence of entries by validating their presence in the Domain Name System.
2. Verify the correctness of entries by validating the specifics of a particular TXT record in the Domain Name System.
Extending the verification tasks to any known DNS Root Zone alternative[5] is technically possible, and may even be desirable at some point.
Ciprnodes
Most of the functions to keep the Cipr running rely on its ciprnodes. A ciprnode is a daemon whose main function is to hold a queryable copy of the Cipr and keep it in sync with all the other copies on the rest of the ciprnodes.
The second function of a ciprnode is to act as an entry point for search requests to the Cipr and to the resource it indexes.
Each ciprnode must be published following this pattern:
https://ciprnode.{za}
Where {za} is the same as sldl.tldl. The literal ciprnode must be the third level domain (3LD) label assigned to the daemon, for example:
https://ciprnode.cipr.info
Important note: For some country code top-level domains (ccTLDs), the registration of second level domains is restricted or forbidden. This means that resources like bbc.co.uk, up.edu.br or ivic.gob.ve cannot be indexed in the Cipr, because allowing ciprnodes below the 3LD would permit an unlimited number of ciprnodes under a single za.
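The publishing pattern above can be sketched in a few lines of Python. The function names are hypothetical and are not part of this specification; the second helper simply takes the last two labels of a hostname as its sldl.tldl.

```python
def ciprnode_url(za: str) -> str:
    """Return the address at which the ciprnode for a zone apex is published."""
    return f"https://ciprnode.{za}"


def za_from_host(host: str) -> str:
    """Take the last two labels of a hostname as its sldl.tldl zone apex."""
    labels = host.rstrip(".").lower().split(".")
    if len(labels) < 2:
        raise ValueError(f"not enough labels in {host!r}")
    return ".".join(labels[-2:])


print(ciprnode_url("cipr.info"))
print(za_from_host("ciprnode.cipr.info"))
```

Applied to bbc.co.uk, za_from_host yields co.uk, a name the publisher does not own, which is the situation the note above describes.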
[ciprsys]
1. Ciprdup
A ciprdup is the working copy of the Cipr in each ciprnode. It is probably (but not necessarily) a table or a group of tables in an RDBMS.
The fields (or columns) of the ciprdup are: za, title, description, keywords, offering, seeking, primary_lang, ol, latitude, longitude and timestamp.
---------------------------------------------------------------------------------------------------------------------------
za            title          description   keywords   offering  seeking  primary_lang  ol  latitude   longitude  timestamp
------------  -------------  ------------  ---------  --------  -------  ------------  --  ---------  ---------  ----------
pali.to       Little Stick   Stick         polo       star      sticks   es                                      1698417000
meansite.com  We are Mean    Offensive     truck      ala                zh            2   407128000  407128000  1698417000
foobar.org    Foobar Zone    Foobar        late       ola                en                                      1698417000
example.com   Example Site   For examples  rat        table                                407128000  407128000  1698417000
elcoco.buh    All Offense    Very Gross    cigar      tool               ur            3                         1698417000
cipr.info     Specification  Cipr spec     pose pork  RCU       devs     es                407128000  407128000  1698417000
---------------------------------------------------------------------------------------------------------------------------
Table: Ciprdup fields with example data, shown as table rows in an RDBMS.
The fields of the ciprdup are:
za
Zone Apex of the resource in the Domain Name System.
- Constraints:
- Allowed values/length: /^[\p{L}\p{N}](?:[\p{L}\p{N}-]{0,61}[\p{L}\p{N}])?\.[\p{L}\p{N}](?:[\p{L}\p{N}-]{0,61}[\p{L}\p{N}])?$/u
- empty allowed: no.
- Primary key: yes.
- FTS searchable: yes.
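As a rough illustration of the za constraint, the check below approximates the regular expression above in Python. The standard re module does not support \p{L} or \p{N}, so this sketch substitutes str.isalnum(), which is close but not guaranteed to be identical; the function names are hypothetical.

```python
def is_label(label: str) -> bool:
    """One DNS label: 1 to 63 characters, alphanumeric at both ends, hyphens allowed inside."""
    if not 1 <= len(label) <= 63:
        return False
    if not (label[0].isalnum() and label[-1].isalnum()):
        return False
    return all(ch.isalnum() or ch == "-" for ch in label)


def is_valid_za(za: str) -> bool:
    """A za is exactly two labels joined by a single dot (sldl.tldl)."""
    labels = za.split(".")
    return len(labels) == 2 and all(is_label(label) for label in labels)


print(is_valid_za("example.com"))      # two well-formed labels
print(is_valid_za("sub.example.com"))  # three labels
```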
title
The indexed resource’s title.
- Constraints:
- Allowed values/length: /^[^\r\n\u2028\u2029]{1,64}$/u
- empty allowed: no.
- Primary key: no.
- FTS searchable: yes.
description
The resource’s description.
- Constraints:
- Allowed values/length: /^[^\r\n\u2028\u2029]{1,256}$/u
- empty allowed: no.
- Primary key: no.
- FTS searchable: yes.
keywords
Keywords for the resource.
- Constraints:
- Allowed values/length: /^[^\r\n\u2028\u2029]{1,512}$/u
- empty allowed: no.
- Primary key: no.
- FTS searchable: yes.
offering
What is offered or shared through the resource.
- Constraints:
- Allowed values/length: /^[^\r\n\u2028\u2029]{1,128}$/u
- empty allowed: yes.
- Primary key: no.
- FTS searchable: yes.
seeking
What the owner of the resource is looking for.
- Constraints:
- Allowed values/length: /^[^\r\n\u2028\u2029]{1,128}$/u
- empty allowed: yes.
- Primary key: no.
- FTS searchable: yes.
primary_lang
The primary language for the resource.
- Constraints:
- Allowed values/length: /^[a-z]{1,2}$/u
- empty allowed: yes.
- Primary key: no.
- FTS searchable: no.
ol
Offensiveness level, a subjective indicator of how offensive the resource content could be from its publisher’s point of view.
Taking as a starting point that in this context a group is any community, congregation, circle, clan, league, tribe, collective, gang, faction, union, guild or any other form of association based on: sexual orientation, social position, region, ethnicity, culture, nationality, age, profession, gender identity, political views, religious views, ideological views or any other type of affinity; the possible values for the ol field are:
empty: Non Offensive Content, indicates the content is not offensive to any person or social group.
1: Individually Offensive Content, indicates the content could be offensive to specific individuals, to one or more specific persons not related by any particular type of affinity between them.
2: Collectively Offensive Content, indicates the content could be offensive to two or more members of one or more specific groups.
3: Universally Offensive Content, used when the publisher considers the offensiveness of their content is transversal to most social groups in the whole world.
It is suggested to provide extra information in the description field to clarify why the resource is considered offensive.
- Constraints:
- Allowed values/length: /^[1-3]$/
- empty allowed: yes.
- Primary key: no.
- FTS searchable: no.
latitude
Geographic latitude of the resource: the integer value resulting from multiplying the real number that represents the latitude coordinate in WGS 84 (EPSG:4326) format by 10000000. The publisher is free to decide the level of precision to use.
- Constraints:
- Allowed values/length: /^[\d]{9}$/
- empty allowed: yes, only if longitude is also empty.
- Primary key: no.
- FTS searchable: no.
longitude
Geographic longitude of the resource: the integer value resulting from multiplying the real number that represents the longitude coordinate in WGS 84 (EPSG:4326) format by 10000000. The publisher is free to decide the level of precision to use.
- Constraints:
- Allowed values/length: /^[\d]{9}$/
- empty allowed: yes, only if latitude is also empty.
- Primary key: no.
- FTS searchable: no.
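The scaling used by the latitude and longitude fields can be sketched as follows. The function names are hypothetical, and the sketch covers only the multiplication itself; how coordinates that do not produce nine unsigned digits are stored is not addressed here.

```python
SCALE = 10_000_000  # the multiplier named in the latitude and longitude definitions


def to_cipr_coordinate(degrees: float) -> int:
    """Scale a WGS 84 coordinate in decimal degrees to the stored integer."""
    return round(degrees * SCALE)


def from_cipr_coordinate(value: int) -> float:
    """Recover decimal degrees from the stored integer."""
    return value / SCALE


print(to_cipr_coordinate(40.7128))      # 407128000, the value used in the example rows
print(from_cipr_coordinate(407128000))  # 40.7128
```

A coordinate given with only four decimal places still yields a nine-digit integer, with the unused precision stored as trailing zeros.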
timestamp
Coordinated Universal Time (UTC) timestamp of the last update of the resource represented with a valid Unix Epoch timestamp (seconds since 1970-01-01T00:00:00Z).
- Constraints:
- Allowed values/length: /^[\d]{10}$/
- empty allowed: no.
- Primary key: no.
- FTS searchable: no.
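The field constraints listed above lend themselves to a table-driven check. The following is a minimal sketch, not a reference implementation: the function name and error messages are hypothetical, the za field is left out because Python's re module lacks \p{L} support, and [0-9] is used in place of \d so that only ASCII digits match.

```python
import re

# Patterns copied from the field constraints; re.fullmatch replaces the ^...$ anchors.
PATTERNS = {
    "title": r"[^\r\n\u2028\u2029]{1,64}",
    "description": r"[^\r\n\u2028\u2029]{1,256}",
    "keywords": r"[^\r\n\u2028\u2029]{1,512}",
    "offering": r"[^\r\n\u2028\u2029]{1,128}",
    "seeking": r"[^\r\n\u2028\u2029]{1,128}",
    "primary_lang": r"[a-z]{1,2}",
    "ol": r"[1-3]",
    "latitude": r"[0-9]{9}",
    "longitude": r"[0-9]{9}",
    "timestamp": r"[0-9]{10}",
}
# Fields marked "empty allowed: yes".
OPTIONAL = {"offering", "seeking", "primary_lang", "ol", "latitude", "longitude"}


def validate(record: dict) -> list[str]:
    """Return a list of constraint violations; an empty list means the record passes."""
    errors = []
    for field, pattern in PATTERNS.items():
        value = record.get(field)
        if value is None or value == "":
            if field not in OPTIONAL:
                errors.append(f"{field}: required")
            continue
        if not re.fullmatch(pattern, str(value)):
            errors.append(f"{field}: invalid value")
    # latitude and longitude may only be empty together.
    lat_empty = record.get("latitude") in (None, "")
    lon_empty = record.get("longitude") in (None, "")
    if lat_empty != lon_empty:
        errors.append("latitude/longitude: must be both set or both empty")
    return errors


record = {"title": "Foobar Zone", "description": "Foobar", "keywords": "late",
          "primary_lang": "en", "timestamp": 1698417000}
print(validate(record))
```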
2. Resindex
A resindex is the indexed content of a cipred resource. The creation of the resindex is the exclusive responsibility of each publisher, and how to create it is up to them (Pagefind, YaCy, Meilisearch, LLM/RAG tools, etc.). No matter how or when it is generated, however, the resindex must be queryable through the CiprAPI in a standard way.
The use of a resindex isn’t mandatory, but having it is extremely convenient for the publisher; this is an optional but desirable component.
3. CiprAPI
CiprAPI is a strict Semantic RESTful Web API used by every ciprnode to:
- Query the ciprdup and the resindex of the ciprnode
- Maintain the ciprdup in sync with the Cipr
- Audit its peers to guarantee the trustworthiness, reliability and up-to-dateness of the Cipr
The CiprAPI supports the following media types for the information exchange:
HAL, when the Accept: header includes any of the following media types:
- application/hal+json
- application/hal+json; charset=utf-8
- application/hal+xml
- application/hal+xml; charset=utf-8
Plain text, when the Accept: header includes the following media types:
- text/plain
- text/plain; charset=utf-8
HTML chunks or fragments, when the header HX-Request: is present and true, and the Accept: header is absent or present with one of the following media types:
- */*
- text/html
- text/html; charset=utf-8
- application/xhtml+xml
- application/xhtml+xml; charset=utf-8
Full HTML with HEAD and BODY tags[6], when the header HX-Request: is absent or has false value, and the Accept: header is absent or includes the following media types:
- */*
- text/html
- text/html; charset=utf-8
- application/xhtml+xml
- application/xhtml+xml; charset=utf-8
UTF-8 must always be used in every response, whether or not it is requested, and it is assumed to be the default charset for every request and response.
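The media type rules above can be read as a small decision function. The sketch below is one possible reading, under stated assumptions: the function names and returned labels are hypothetical, q-values are ignored, and when an Accept header lists several supported types the rules are tried in the order they appear above, which this specification does not mandate.

```python
from typing import Optional

HAL_JSON = "application/hal+json"
HAL_XML = "application/hal+xml"
HTML_TYPES = {"*/*", "text/html", "application/xhtml+xml"}


def media_types(accept: Optional[str]) -> list[str]:
    """Split an Accept header into bare media types, dropping parameters such as charset."""
    if not accept:
        return []
    return [part.split(";")[0].strip().lower() for part in accept.split(",") if part.strip()]


def choose_representation(accept: Optional[str], hx_request: Optional[str]) -> Optional[str]:
    """Pick a response format from the Accept and HX-Request header values."""
    types = media_types(accept)
    if HAL_JSON in types:
        return "hal+json"
    if HAL_XML in types:
        return "hal+xml"
    if "text/plain" in types:
        return "plain"
    if not types or HTML_TYPES.intersection(types):
        # HX-Request decides between an HTML fragment and a full HTML page.
        is_htmx = (hx_request or "").strip().lower() == "true"
        return "html-fragment" if is_htmx else "html-full"
    return None  # a media type none of the rules above covers


print(choose_representation("application/hal+json; charset=utf-8", None))
print(choose_representation(None, "true"))
```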
The CiprAPI exposes the following endpoints:
- GET / - Retrieves the contents of the ciprdup.
- GET /{za}/ - Retrieves all fields for a specific cipred resource.
- GET /{za}/title/ - Retrieves the title of a specific cipred resource.
- GET /{za}/description/ - Retrieves the description of a specific cipred resource.
- GET /{za}/keywords/ - Retrieves the keywords of a specific cipred resource.
- GET /{za}/offering/ - Retrieves the offering being made through a specific cipred resource.
- GET /{za}/seeking/ - Retrieves the seeking being made through a specific cipred resource.
- GET /{za}/ol/ - Retrieves the value of the offensiveness level of a specific cipred resource.
- GET /{za}/primary_lang/ - Retrieves the value of the primary language of a specific cipred resource.
- GET /{za}/latitude/ - Retrieves the latitude of a specific cipred resource.
- GET /{za}/longitude/ - Retrieves the longitude of a specific cipred resource.
- GET /{za}/timestamp/ - Retrieves the timestamp of a specific cipred resource.
- GET /languages/ - Retrieves the contents of the languages database table. Only to be called from the local ciprface.
- PUT /{za}/ - Adds a new cipred resource to the Cipr.
- DELETE /{za}/ - Removes a cipred resource from the Cipr.
- QUERY / - Queries the ciprdup of the ciprnode with a given FTS expression and filters.
- QUERY /ri/ - Queries the resindex (ri) of the cipred resource with a given expression.
- HEAD / - Verifies the presence of a ciprnode in the Cipr.
- HEAD /ri/ - Verifies the presence of a resindex (ri) in the ciprnode.
Use of the GET method
A GET request to / accepts the pages[size] query parameter, where size is an integer (n) indicating the expected number of entries. The entries in the Cipr are not expected to be ordered, so pagination is not feasible. A GET request to /{za}/ retrieves a single row with all the fields of a specific cipred resource, and a GET request to a field endpoint such as /{za}/title/ retrieves only that field. All GET endpoints support content negotiation via the Accept header. Examples:
This request asks the Cipr to retrieve the full Cipr[7]:
GET / HTTP/1.1
Host: ciprnode.example.com
This request asks the Cipr to retrieve 2048 entries:
GET /?pages[size]=2048 HTTP/1.1
Host: ciprnode.example.com
This request asks the Cipr to retrieve the row corresponding to the barriteau.net zone apex as HAL JSON:
GET /barriteau.net/ HTTP/1.1
Host: ciprnode.guasa.art
Accept: application/hal+json
HTTP/1.1 200 OK
Content-Type: application/hal+json; charset=utf-8
{
"za": "barriteau.net",
"title": "Barriteau",
"description": "The Barriteau resource",
"keywords": "barriteau net example",
"offering": null,
"seeking": null,
"ol": null,
"latitude": null,
"longitude": null,
"timestamp": 1698417000,
"primary_lang": "en",
"_links": {
"self": { "href": "/barriteau.net/" },
"collection": { "href": "/" }
}
}
This request asks the Cipr to retrieve the title of the barriteau.net cipred resource as plain text:
GET /barriteau.net/title/ HTTP/1.1
Host: ciprnode.cipr.info
Accept: text/plain
HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Barriteau
This request retrieves the same field as HAL JSON, which includes HATEOAS links to all sibling fields:
GET /barriteau.net/title/ HTTP/1.1
Host: ciprnode.cipr.info
Accept: application/hal+json
HTTP/1.1 200 OK
Content-Type: application/hal+json; charset=utf-8
{
"title": "Barriteau",
"_links": {
"self": { "href": "/barriteau.net/title/" },
"up": { "href": "/barriteau.net/" },
"description": { "href": "/barriteau.net/description/" },
"keywords": { "href": "/barriteau.net/keywords/" },
"offering": { "href": "/barriteau.net/offering/" },
"seeking": { "href": "/barriteau.net/seeking/" },
"ol": { "href": "/barriteau.net/ol/" },
"primary_lang": { "href": "/barriteau.net/primary_lang/" },
"latitude": { "href": "/barriteau.net/latitude/" },
"longitude": { "href": "/barriteau.net/longitude/" },
"timestamp": { "href": "/barriteau.net/timestamp/" }
}
}
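For illustration, the GET requests shown above could be assembled with Python's standard library as follows. This is a sketch only: build_get and its parameters are hypothetical names, and the request object is built but not sent.

```python
from typing import Optional
from urllib.request import Request


def build_get(node_za: str, za: str, field: Optional[str] = None,
              accept: str = "application/hal+json") -> Request:
    """Build a GET request for a whole entry, or for a single field of it."""
    path = f"/{za}/" if field is None else f"/{za}/{field}/"
    return Request(f"https://ciprnode.{node_za}{path}",
                   headers={"Accept": accept}, method="GET")


req = build_get("cipr.info", "barriteau.net", "title", accept="text/plain")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the returned object to urllib.request.urlopen would send it to the chosen ciprnode.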
This request asks the Cipr to retrieve the list of languages matching with the q query parameter. The /languages/ endpoint is restricted to same-origin requests using the Sec-Fetch-Site header:
GET /languages/?q=Spanish HTTP/1.1
Host: ciprnode.cipr.info
Sec-Fetch-Site: same-origin
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
[
{
"lang_code": "es",
"lang_name": "Español",
"lang_name_en": "Spanish"
}
]
Use of the PUT method
A PUT request to /{za}/ adds a new cipred resource to the Cipr if it does not exist, or updates it if it does. The request body must be JSON and must contain at least all the required fields of a cipred resource. The response has no body; the outcome is indicated by the HTTP status code (202 Accepted for new entries, idempotent updates and self-insertions) and the Location header. Example:
PUT /guasa.art/ HTTP/1.1
Host: ciprnode.cipr.info
Content-Type: application/json; charset=utf-8
{
"za": "guasa.art",
"title": "La web de los ejemplos",
"description": "En esta web hay la la la",
"keywords": "perro gato loro",
"offering": "ejemplos gratis",
"seeking": null,
"primary_lang": "es",
"ol": null,
"latitude": 407128000,
"longitude": 407128000,
"timestamp": 1698417000
}
HTTP/1.1 202 Accepted
Location: /guasa.art/
Content-Length: 0
Before proceeding with the effective insertion/update of a PUTed entry in the ciprdup, a ciprnode must execute the Insertion Validation Sequence:
0. Currentness Validation: check that the value in the timestamp field is not older than 24 hours.
1. Ownership Validation: DNS query to check if the TXT record for the new cipred resource exists and is valid.
2. Availability Validation: HEAD / request to https://ciprnode.{za} to check if the cipred resource is responding.
3. Reliability Validation: QUERY / to https://ciprnode.{za} to validate the correctness of the resource’s query results.
The insertion won’t be effective if at least one of those checks fails.
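The Insertion Validation Sequence can be sketched as a chain of checks that stops at the first failure. Only the Currentness Validation is implemented concretely here; the ownership, availability and reliability checks are passed in as hypothetical callables, since their details (DNS lookup, HEAD and QUERY requests) are outside this sketch.

```python
import time
from typing import Callable, Optional

DAY_SECONDS = 24 * 60 * 60
Check = Callable[[str], bool]  # takes a za, returns True when the check passes


def is_current(timestamp: int, now: Optional[float] = None) -> bool:
    """Currentness Validation: the timestamp is not older than 24 hours."""
    now = time.time() if now is None else now
    return now - timestamp <= DAY_SECONDS


def may_insert(entry: dict, owns: Check, available: Check, reliable: Check,
               now: Optional[float] = None) -> bool:
    """Run the four checks in order; `and` stops at the first one that fails."""
    za = entry["za"]
    return (is_current(entry["timestamp"], now)
            and owns(za) and available(za) and reliable(za))


entry = {"za": "guasa.art", "timestamp": 1698417000}
always = lambda za: True  # stand-in for the three network checks
print(may_insert(entry, always, always, always, now=1698417000 + 3600))
```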
Use of the DELETE method
A DELETE request to /{za}/ removes a cipred resource from the Cipr if it exists. The response is always 202 Accepted with no body, regardless of whether the entry was actually deleted or the deletion was rejected because the node passed validation. Self-deletions are silently ignored. Example:
DELETE /example.com/ HTTP/1.1
Host: ciprnode.barriteau.net
HTTP/1.1 202 Accepted
Content-Length: 0
Before proceeding with the effective deletion of a DELETEd entry in the ciprdup, a ciprnode must execute the Deletion Validation Sequence:
1. Ownership Validation: DNS query to check if the TXT record for the cipred resource exists and is valid.
2. Availability Validation: HEAD / request to https://ciprnode.{za} to check if the cipred resource is responding.
3. Reliability Validation: QUERY / to https://ciprnode.{za} to validate the correctness of the resource’s query results.
The Reliability Validation requires a random FTS expression and random pages[num] and pages[size] query parameters. It may also reuse FTS expressions received from users of the ciprface, which implies that the ciprnode must be able to store and retrieve those expressions.
The deletion of an entry won’t take effect if all three checks pass.
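The deletion rule is the inverse of an ordinary validation: an entry is removed only when the resource no longer validates. A minimal sketch, with the three checks passed in as hypothetical callables:

```python
from typing import Callable

Check = Callable[[str], bool]  # takes a za, returns True when the check passes


def deletion_takes_effect(za: str, owns: Check, available: Check, reliable: Check) -> bool:
    """The entry is kept when all three checks pass, and removed otherwise."""
    still_valid = owns(za) and available(za) and reliable(za)
    return not still_valid


passing = lambda za: True
failing = lambda za: False
print(deletion_takes_effect("example.com", passing, passing, passing))  # entry is kept
print(deletion_takes_effect("example.com", passing, failing, passing))  # entry is removed
```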
Use of the QUERY method
A QUERY / request must accept the pages[num] and pages[size] query parameters, where num is an array of integers (n) and/or ranges (n-m) indicating which page numbers are expected, and size is an array of integers (n) indicating the expected number of entries per page. For example:
This queries the first page of search results with the number of entries defaulted in the ciprnode’s configuration:
QUERY / HTTP/1.1
Host: ciprnode.example.com
Content-Type: text/plain; charset=utf-8
Accept: application/x-www-form-urlencoded; charset=utf-8
query="FTS expression"
ol=[0,1,2,3]
geo_latitude=latitude
geo_longitude=longitude
geo_min_radius_km=radius
geo_max_radius_km=radius
before=timestamp
after=timestamp
This queries the fifth page of search results with the number of entries defaulted in the ciprnode’s configuration:
QUERY /?pages[num]=5 HTTP/1.1
Host: ciprnode.example.com
Content-Type: text/plain; charset=utf-8
Accept: application/x-www-form-urlencoded; charset=utf-8
query="FTS expression"
ol=[0,1,2,3]
geo_latitude=latitude
geo_longitude=longitude
geo_min_radius_km=radius
geo_max_radius_km=radius
before=timestamp
after=timestamp
This queries the first page of search results with 30 entries:
QUERY /?pages[size]=30 HTTP/1.1
Host: ciprnode.example.com
Content-Type: application/hal+json; charset=utf-8
Accept: application/hal+json; charset=utf-8
{
"query": "FTS expression",
"ol": [0,1,2,3],
"geo": {
"latitude": "latitude",
"longitude": "longitude",
"geo_min_radius_km": "radius",
"geo_max_radius_km": "radius"
},
"before": "timestamp",
"after": "timestamp",
"pages_num": [num],
"pages_size": [size]
}
This queries the first page of search results with 10 entries:
QUERY /?pages[num]=1&pages[size]=10 HTTP/1.1
Host: ciprnode.example.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Accept: application/x-www-form-urlencoded; charset=utf-8
query="FTS expression"
&ol=[0,1,2,3]
&geo_latitude=latitude
&geo_longitude=longitude
&geo_min_radius_km=radius
&geo_max_radius_km=radius
&before=timestamp
&after=timestamp
&pages_num=[num]
&pages_size=[size]
This queries the second, sixth and tenth pages of search results with 20 entries each:
QUERY /?pages[num]=[2,6,10]&pages[size]=[20] HTTP/1.1
Host: ciprnode.example.com
Content-Type: application/hal+json; charset=utf-8
Accept: application/hal+json; charset=utf-8
{
"query": "FTS expression",
"ol": [0,1,2,3],
"geo": {
"latitude": "latitude",
"longitude": "longitude",
"geo_min_radius_km": "radius",
"geo_max_radius_km": "radius"
},
"before": "timestamp",
"after": "timestamp",
"pages_num": [num],
"pages_size": [size]
}
This queries the fourth to eighth pages of search results with 10 entries each:
QUERY /?pages[num]=[4-8]&pages[size]=10 HTTP/1.1
Host: ciprnode.example.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Accept: application/x-www-form-urlencoded; charset=utf-8
query="FTS expression"
&ol=[0,1,2,3]
&geo_latitude=latitude
&geo_longitude=longitude
&geo_min_radius_km=radius
&geo_max_radius_km=radius
&before=timestamp
&after=timestamp
&pages_num=[num]
&pages_size=[size]
This queries the eleventh to twentieth and the twenty-first to fortieth pages of search results, with 10 entries per page for the first group and 20 entries per page for the second group:
QUERY /?pages[num]=[11-20,21-40]&pages[size]=[10,20] HTTP/1.1
Host: ciprnode.example.com
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Accept: text/html; charset=utf-8
HX-Request: true
query="FTS expression"
&ol=[0,1,2,3]
&geo_latitude=latitude
&geo_longitude=longitude
&geo_min_radius_km=radius
&geo_max_radius_km=radius
&before=timestamp
&after=timestamp
&pages_num=[num]
&pages_size=[size]
Note the last one is asking for the results to be returned as HTML fragments instead of JSON.
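The pages[num] and pages[size] values used in the requests above can be parsed into page groups along these lines. This is a sketch under assumptions: the function names are hypothetical, the default size of 10 stands in for the value configured in the ciprnode, and a single size is assumed to apply to every group.

```python
def parse_list(value: str) -> list[str]:
    """Split '[2,6,10]' or '5' into its comma-separated items."""
    return [item.strip() for item in value.strip().strip("[]").split(",") if item.strip()]


def page_plan(num: str = "1", size: str = "10") -> list[tuple[int, int, int]]:
    """Return (first_page, last_page, entries_per_page) for each requested group."""
    groups = parse_list(num)
    sizes = [int(s) for s in parse_list(size)]
    if len(sizes) == 1:
        sizes = sizes * len(groups)  # one size applies to every group
    if len(sizes) != len(groups):
        raise ValueError("pages[size] needs one entry, or one per pages[num] entry")
    plan = []
    for group, page_size in zip(groups, sizes):
        first, _, last = group.partition("-")  # 'n-m' is a range, 'n' a single page
        plan.append((int(first), int(last or first), page_size))
    return plan


print(page_plan("[11-20,21-40]", "[10,20]"))  # [(11, 20, 10), (21, 40, 20)]
print(page_plan("[2,6,10]", "[20]"))          # [(2, 2, 20), (6, 6, 20), (10, 10, 20)]
```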
Example responses to the above requests:
HTTP/1.1 200 OK
Content-Type: application/x-www-form-urlencoded; charset=utf-8
Date: Wed, 18 Feb 2026 10:09:00 GMT
Content-Length: 368
count=42
&pages[current]=1
&pages[total]=5
&results[0][za]=sub.example.com
&results[0][title]=FTS Expression Guide
&results[0][description]=A complete guide to Full Text Search expressions.
&results[0][timestamp]=1698417000
&results[1][za]=blog.example.com
&results[1][title]=My First FTS Post
&results[1][description]=Testing the expression engine.
&results[1][timestamp]=1698417055
HTTP/1.1 200 OK
Content-Type: application/hal+json; charset=utf-8
Date: Wed, 18 Feb 2026 10:09:00 GMT
Content-Length: 845
{
"_links": {
"self": { "href": "/?pages[num]=1" },
"first": { "href": "/?pages[num]=1" },
"last": { "href": "/?pages[num]=5" },
"next": { "href": "/?pages[num]=2" }
},
"count": 42,
"pages[num]": [1],
"pages[size]": [10],
"_embedded": {
"results": [
{
"za": "sub.example.com",
"title": "FTS Expression Guide",
"description": "A complete guide to Full Text Search expressions.",
"keywords": "fts query search",
"offering": null,
"seeking": null,
"ol": null,
"latitude": null,
"longitude": null,
"timestamp": 1698417000,
"primary_lang": "en",
"score": 12.5,
"lang_name": "English",
"lang_name_en": "English",
"_links": { "self": { "href": "/sub.example.com/" } }
},
{
"za": "blog.example.com",
"title": "My First FTS Post",
"description": "Testing the expression engine.",
"keywords": "blog post test",
"offering": null,
"seeking": null,
"ol": null,
"latitude": null,
"longitude": null,
"timestamp": 1698417055,
"primary_lang": "en",
"score": 8.2,
"lang_name": "English",
"lang_name_en": "English",
"_links": { "self": { "href": "/blog.example.com/" } }
}
]
}
}
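The _links object in the HAL response above follows from the result count and the page size: 42 results at 10 per page give 5 pages. A minimal sketch of that arithmetic, with a hypothetical function name and an assumed prev link for pages after the first (the example shows only self, first, last and next):

```python
import math


def pagination_links(count: int, size: int, current: int) -> dict:
    """Build HAL-style pagination links for one page of results."""
    total = max(1, math.ceil(count / size))

    def href(page: int) -> dict:
        return {"href": f"/?pages[num]={page}"}

    links = {"self": href(current), "first": href(1), "last": href(total)}
    if current < total:
        links["next"] = href(current + 1)
    if current > 1:
        links["prev"] = href(current - 1)  # assumption: not shown in the example response
    return links


print(pagination_links(count=42, size=10, current=1))
```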
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Date: Wed, 18 Feb 2026 10:09:00 GMT
Content-Length: 512
HX-Trigger-After-Swap: update-pagination
FTS Expression Guide
A complete guide to Full Text Search expressions.
My First FTS Post
Testing the expression engine.