Search engines
Meilisearch is neat, as is the tokenizer library it uses. More practically, DocSearch is a great plug-and-play solution. Tantivy, Quickwit and Edgesearch are interesting too.
Use Lyra for browser-side search. classes.wtf is a great implementation of fast search.
Links
- Algolia - Site Search & Discovery powered by AI.
- Toshi - Full-text search engine in rust.
- Sonic - Fast, lightweight & schema-less search backend. An alternative to Elasticsearch that runs on a few MBs of RAM. (Sonic Channel)
- CROKAGE: A New Way to Search Stack Overflow (2019)
- Bayard - Full-text search and indexing server written in Rust.
- MeiliSearch - Ultra relevant, instant and typo-tolerant full-text search API. (Web) (HN) (GitHub) (Awesome)
- MeiliSearch JavaScript Client
- MeiliSearch React - Integrate a front-end search bar in your React application using MeiliSearch.
- MeiliSearch Go - MeiliSearch API client written for Go.
- Hands on with: MeiliSearch - A next generation search engine for modern web (2020)
- The Joy of Search - Google Insider's Guide to Going Beyond the Basics.
- searX - Metasearch engine, aggregating the results of other search engines while not storing information about its users.
- Searx: moving away from DuckDuckGo (2021) (HN)
- DarkDarkGo - Modeled Google and Bing to build a distributed search engine for the dark web.
- How is search so bad? A case study (2020) (HN)
- Lobsters: How would one build a search engine today? (2020)
- Building a search engine from scratch (2019) - Whirlwind tour of the big ideas powering our web search.
- Typesense - Fast, typo tolerant search engine for building delightful search experiences. (Web) (HN) (HN)
- go-query - Blazingly fast query engine.
- Blast - Full text search and indexing server, written in Go, built on top of Bleve.
- Riot search - Go Open Source, Distributed, Simple and efficient full text search engine.
- YouTokenToMe - Unsupervised text tokenizer focused on computational efficiency.
- Milvus - Open source vector similarity search engine. (Web)
- PISA - Performant Indexes and Search for Academia. (HN)
- Apache Lucene - High-performance, full featured text search engine library written in Java. (Awesome)
- NNS Benchmark: Evaluating Approximate Nearest Neighbor Search Algorithms in High Dimensional Euclidean Space
- The Anatomy of a Large-Scale Hypertextual Web Search Engine: Sergey Brin and Lawrence Page (1998)
- Million Short - Search engine that lets you exclude top sites.
- Ask HN: Is there a search engine which excludes the world's biggest websites? (2020)
- Query Combinators (2017)
- Quickref - Experimental search engine for developers. Searches a curated subset of the web: official docs and community-driven sources. (Lobsters) (HN)
- Tantivy - Full-text search engine library inspired by Apache Lucene and written in Rust. (Article)
- sonar - Search engine based on tantivy with a Node.js frontend.
- How to make PageRank faster (with lots of math and a hint of Python) (2020)
- Writing a full-text search engine using Bloom filters (2013) (HN)
- Tinysearch - Tiny, full-text search engine for static websites built with Rust and WASM. (Article) (HN)
- Tinysearch-Go - Go based WASM Tiny Search inspired by Endler.dev.
- Elasticlunr.js - Lightweight full-text search engine in JavaScript for browser search and offline search. (Code)
- Zola - Can build a search index from the sections and pages content to be used by a JavaScript library such as elasticlunr.
- Creating a full-text search engine in Apache Pinot (2020) (HN)
- HN: Building a Search Engine for Programmers (2020)
- Search Commons - Open source project that maintains a directory of websites you can restrict your search to. (Code)
- Ego Graphs - the Google "vs" trick (2020) (HN)
- Reducing search indexing latency to one second (2020) (HN)
- Ecosia - Search engine that plants trees. (HN) (GitHub)
- paxx - Simple inverted index search engine.
- Sajari - AI-driven Search Solutions. (GitHub)
- Runnaroo - A Better Private Search Engine. (HN)
- Wiby - Search Engine for the Classic Web. (HN)
- Let's build a Full-Text Search engine (2020) (Lobsters) (HN) (Code)
- ScaNN: Efficient Vector Similarity Search (2020) (HN)
- Neeva - Search Reimagined. (HN)
- aPPR - Approximate Personalized Page Rank.
- Infinity Search - Open-source search engine. (Code) (HN)
- hndex.org - Full-text search engine of articles submitted to HN. (HN)
- Sourcegraph - Search public code.
- Autocomplete VS graph - Visualization of Google's autocomplete. (Code)
- Dorking: the use of search engines to find very specific data (2020) (HN)
- Create and use HTML full text search index (C++)
- Google Search tips
- Searching code with Sourcegraph (Lobsters)
- A Fast Fuzzy Search Implementation (2020)
- Using a search engine as a programmer (2020)
- Google's powerful code search tooling (2020)
- Full-Text Search Battle: PostgreSQL vs Elasticsearch (2020) (Lobsters)
- Lunr.js - Small, full-text search library for use in the browser. Indexes JSON documents and provides a simple search interface for retrieving documents that best match text queries. (Web)
- Linxy - Search engine which creates feeds based on multiple input search phrases. (HN)
- Awesome Algolia
- Write an Internet search engine with 200 lines of Ruby code (2009)
- The Search Engine Map
- Algolia Netlify plugin - Automatically index your website to Algolia when deploying your project to Netlify with the Algolia Crawler.
- How Google autocomplete predictions are generated (2020)
- Fist - Fast, lightweight, full-text search and index server. Fist stores all information in memory making lookups very fast while also persisting the index to disk. The index can be accessed over a TCP connection and all data returned is valid JSON.
- Towards an Understanding of Search Engines (2017)
- Search personal websites (2020)
- GoCrawler - Distributed web crawler implemented using Go, Postgres, RabbitMQ and Docker.
- From Then to Now: a Curated List for Neural Search and Jina (2020)
- CodeSearch - Search engine for code, written in Rust. (Article)
- InstantSearch.js - JavaScript library for building performant and instant search experiences with Algolia. (Docs) (React InstantSearch - Lightning-fast search for React and React Native applications)
- Unofficial Google Advanced Search
- Edgesearch - Build a full text search API using Cloudflare Workers and WebAssembly.
- fzy-lua - Lua port of fzy's fuzzy string matching algorithm.
- Firesearch - Serverless full-text search. For Google Cloud Platform. (Code)
- Qwant - Search engine that respects your privacy. (HN)
- YaCy - Distributed Peer-to-Peer Web Search Engine and Intranet Search Appliance. (Code)
- We can do better than DuckDuckGo (2020) (HN)
- PolyFuzz - Fuzzy string matching, grouping, and evaluation.
- Xapiand - Modern Highly Available Distributed RESTful Search and Storage Engine built for the Cloud and with Data Locality in mind. (Web)
- Xapian - Open Source Search Engine Library. Written in C++. (Web)
- NZBHydra - Meta search for newznab indexers and torznab trackers.
- Fast Autocomplete Search for Your Website (2018)
- Whoogle Search - Self hosted, ad-free, privacy-respecting Google metasearch engine. (Code) (HN)
- Mojeek - Alternative search engine that puts the people who use it first. Uses independent web spider. (HN)
- You.com - Private search engine that summarizes the web, built for devs. (HN) (Tweet)
- Devbook - Search Engine for Developers. (HN)
- Building a Better Search Engine for Semantic Scholar (2020)
- Google's Search Quality Evaluation Guidelines
- Txtai - AI-powered search engine written in Python. (4.0)
- Jina - Cloud-native neural search framework for any kind of data.
- Okeano - Search engine that cleans the ocean and respects your privacy.
- Private.sh - Search engine that cryptographically protects your privacy.
- Wade - Blazing fast 1kb search library.
- Aves API - Insanely Fast Google Search API.
- Vespa - Engine for low-latency computation over large data sets. (Code) (Tweet)
- Google Search API - Python based API for searching google web, images, calc, and currency conversion.
- Knuth-Morris-Pratt string-searching algorithm: DFA-less version (HN)
- What GitHub Search Needs to Improve (2021) (HN)
- Lieu - Community search engine. (Code)
- Search with typo tolerance (2021)
- Same Energy - Visual Search Engine.
- Reiz.IO - Large Scale Structural Source Code Search. (Code)
- DataHub - Generalized Metadata Search & Discovery Tool. (Web)
- Milli - Search through millions of documents in milliseconds.
- Building a full-text search engine in 150 lines of Python code (2021) (In Rust)
- Google's Got A Secret (HN)
- Evaluating Search Algorithms (2021) (HN)
- Portal - Full Text Search Web Service.
- Vald - Highly Scalable Distributed Vector Search Engine. (Code) (HN)
- Awesome Vector Search Engine
- OpenGameArt Search + Reverse Image Search - Reverse image search for pixel art. (Code) (HN)
- RapidFuzz - Rapid fuzzy string matching in Python using various string metrics.
- Dedupe Python Library - Python library that uses machine learning to perform fuzzy matching, deduplication and entity resolution quickly on structured data.
- Weaviate - Cloud-native, modular, real-time vector search engine. (Docs) (Awesome)
- Internet Search Tips - Description of advanced tips and tricks for effective Internet research of papers/books.
- Get better at Googling (2021) (HN)
- Metasearch - Search aggregator for Slack, Google Docs, GitHub, and more.
- Quickwit - Big data search engine. (Web) (0.1 release)
- How image search works at Dropbox (2021) (HN)
- Billion-Scale Approximate Nearest Neighbor Search (2020)
- Recommendations and Results Organization in Netflix Search (2021)
- Full-Text Search PostgreSQL or ElasticSearch (2021)
- Ask HN: Has Google search got noticeably worse recently? (2021)
- Pinecone - Managed Vector Similarity Search.
- Extreme Classification with Vector Similarity Search
- Brave Search - Private search. (HN) (HN)
- System Design for Recommendations and Search (2021)
- How not to break a search engine (2021)
- Writing an Image Search Engine from Scratch (Code)
- Monocle - Universal, personal search engine. (Web) (Article)
- Algolia Recommend - ML Product Recommender systems & APIs.
- Apollo - Unix-style personal search engine and web crawler for your digital footprint.
- APSE - Personal Search Engine. (HN)
- Time Series Similarity Search
- How MDN's autocomplete search works (2021) (HN)
- Evolution of Search Engines Architecture - Algolia Search Architecture (2021) (HN) (HN)
- Code search guide - Everything you ever wanted to know about code search. (Code)
- Aleph - Search and browse documents and data; find the people and companies you look for. (Docs)
- MeiliSearch: A Minimalist Full-Text Search Engine (2021) (HN)
- Custom Search Engine Built on Searx (2021) (HN)
- Marginalia Search - Search engine that favors text-heavy sites and punishes modern web design. (HN) (1 Year Later) (HN)
- Dorks collections list - List of GitHub repositories and articles with lists of dorks for different search engines.
- Nrtsearch: Yelp's Fast, Scalable and Cost Effective Search Engine (2021)
- MacroBase - Data analytics tool that prioritizes attention in large datasets using machine learning. (Web)
- Open Guide to Search Engineering - Want to build or improve a search experience? Start here.
- site-search - Lightweight self-hosted alternative to DocSearch. Will run your website locally, crawl its pages and index their content in a lunr index.
- exaly Search Engine - Comprehensive scholarly search engine. Similar to Google Scholar.
- Tips for efficiently Googling
- Awesome Semantic-Search
- Awesome Search Engine Optimization Ideas
- IndieWeb Search - Search web sites published by members of the IndieWeb community and related sites. (Code)
- Felvin Search - Your search box is now an app store. (Code)
- MiniSearch - Tiny and powerful JavaScript full-text search engine for browser and Node.
- Search Engine Parser - Package that lets you query popular search engines and scrape for result titles, links, descriptions and more.
- What every software engineer should know about search (2017) (HN)
- IndexNow - Easy way for website owners to instantly inform search engines about the latest content changes on their website. (Article) (Article)
- typesense-js - JavaScript / TypeScript client for Typesense.
- Wikipedia search engine
- googler - Google from the terminal.
- DuckDuckGo as a TTY (HN)
- Recoll - Desktop full-text search tool. (HN)
- Qdrant - Vector Search Engine. (Code) (GitHub) (Twitter)
- Trie in JavaScript: the data structure behind autocomplete (2021) (HN)
- We need more boutique search engines (2021) (HN)
- liqe - Lightweight and performant Lucene-like parser and search engine.
- Zoekt - Fast trigram based code search.
- How Not To Sort By Average Rating (2009) (HN)
- T-Wand: Beat Lucene in Less Than 600 Lines of Code (2021) (HN)
- In-memory, full-text search engine built in Go
- SymSpell - Spelling correction & Fuzzy search: 1 million times faster through Symmetric Delete spelling correction algorithm. (HN)
- SeekStorm - Affordable high-performance search API.
- UIRecord - UI for managing your meilisearch instances.
- lnx - Ultra-fast, adaptable deployment of the tantivy search engine via REST. (Web)
- Occamm - Search engine that lets you refine your queries. (HN)
- Ask HN: Why doesn't anyone create a search engine comparable to 2005 Google? (2021)
- Gigablast - Alternative Web Search Engine. (HN)
- Semantic search through Wikipedia with the Weaviate vector search engine
- Disco CLI - Generate recommendations from CSV files.
- Natural Language Processing (NLP) for Semantic Search (HN)
- Find anything fast with Google's vector search technology (2021) (HN)
- Aquila DB - Easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
- Phalanx - Cloud-native distributed search engine written in Go built on top of Bluge that provides endpoints through gRPC and traditional RESTful API.
- Mwmbl - Open source, non-profit search engine implemented in python. (Code) (HN)
- Google no longer producing high quality search results (2022) (HN)
- Kagi - Premium Search Engine. (HN)
- Cherche - Allows you to create a neural search pipeline using retrievers and pre-trained language models as rankers.
- Jina AI - Neural Search Company. (GitHub)
- Pixie: A System for Recommending 3+ Billion Items to 200+ Million Users in Real-Time (2017)
- Customizing web search
- A Gentle Intro to Vector Search for Developers
- The web starts on page four (2022) (HN)
- Google Search Is Dying (2022) (HN) (Reddit)
- Goopt - Search Engine for a Procedural Simulation of the Web with GPT-3. (HN)
- Awesome Search - All about the (e-commerce) search and its awesomeness.
- Virtualpaper - Document management system with full-text-search.
- couch - Simple full text search engine written in Go.
- Tantiny - Tiny full-text search for Ruby powered by Tantivy.
- Searchengines.guru
- Alexandria Search (HN) (Code)
- If Google sucks then why is everyone still using it? (HN)
- Teclis - Non-commercial Web Search. (HN)
- Ask HN: How do you search large code-base before adding a feature or fixing bug? (2022)
- What I Learned From Running a Concierge Search Engine (2022) (HN)
- Lucene: The Good Parts (2015)
- Andi - Q&A based, ad-free, anti-spam search engine. (HN)
- I Built A Snappy Static Full-text Search with WebAssembly, Rust, Next.js, and Xor Filters (2022)
- Instant Meilisearch - Search client to use Meilisearch with InstantSearch.
- ndx - Full text indexing and searching library.
- The Next Google (2022) (HN)
- Groonga - Open-source full text search engine and column store. (Code)
- Three areas where Google Search lags behind competitors: code, cooking, travel (2022) (HN)
- Giggle - Self-hosted, customizable and ad-free Google Search experience. (HN)
- Spyglass - Personal, self-hosted search engine.
- Susper - Decentralized Search Engine that uses the peer to peer system YaCy and Apache Solr to crawl and index search results. (Code)
- Google This - Simple yet powerful module to retrieve organic search results and much more from Google.
- How to block domains from search results (2022) (HN)
- SEAL - Search Engines with Autoregressive Language models.
- Oldest Search - Search for the oldest result on internet.
- Search My Site - Open source search engine and search as a service for personal and independent websites. (Code)
- The future of search is boutique (2022) (HN)
- Meilisearch Python - Meilisearch API client for Python developers.
- SPLADE: sparse neural search
- Lyra - Fast, in-memory, full-text search engine written in TypeScript. (HN) (Disk Persistence Plugin)
- Presearch - Decentralized Search Engine. (Search)
- How to declutter Google's search results page (Lobsters)
- A look at search engines with their own indexes (2021) (HN)
- Learning-To-Rank: Sorting Search Results (2022)
- Brave Search Goggles - Alter search rankings with rules and filters. (HN)
- Awesome Hacker Search Engines
- Similari - In-memory similarity search engine with parallelized data processing.
- SearXNG - Free internet metasearch engine which aggregates results from various search services and databases. Users are neither tracked nor profiled. (Docs)
- Build your own Search Engine (2022) (Code) (HN)
- Hello - Conversational Search.
- SearchHut - Curated free software search engine. (Code) (HN)
- Mood Board Search - Train a computer to recognize visual concepts using mood boards and machine learning.
- Pagefind - Fully static search library that aims to perform well on large sites, while using as little of your users' bandwidth as possible. (Docs) (Lobsters)
- libsearch - Simple, index-free full-text search for JavaScript.
- DuckDuckScrape - Search from DuckDuckGo and utilize its spice APIs for things such as stocks, weather, currency conversion and more.
- minisearch - Tiny search engine. Suitable for in-browser use, this provides n-gram based, English search results.
- Spreading vectors for similarity search
- Course catalog with extremely fast full-text search
- How codesearch.ai works - Sourcegraph