cryptocurrency·@rtrade·
# Run Your Own IPFS Search Engine With Lens
[Lens](https://github.com/RTradeLtd/Lens) is another one of our open-source IPFS tools under the Temporal umbrella. It takes content from IPFS and indexes it so that it can be searched later. Currently Lens can index the following mime types:
* text/*
* image/*
* application/pdf
The one requirement is that all your data exists on IPFS, and is discoverable by the running Lens instance. In the future we may add support for other distributed networks, such as DAT or SWARM. To interact with Lens we have a simple, but robust gRPC API that supports both simple and complex queries.
## How Does Indexing Work
We have a few different methods of analyzing data that we chain together. When given PDFs, we first attempt to extract images and text from the pages. The text is fed into bleve, which is capable of handling simple and complex search queries. The images are analyzed using a combination of Tesseract, for optical character recognition to extract searchable text, and Tensorflow, for rudimentary image classification. When analyzing mime types such as image/*, we perform the same Tesseract and image classification analysis as we do with images extracted from PDFs. When analyzing mime types like text/*, we feed the text directly into bleve.
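The routing described above can be summarized in a few lines of Go. This is only a minimal sketch of the idea: the function name `analysisSteps` and the stage labels are made up for illustration and are not taken from the Lens codebase.

```go
package main

import (
	"fmt"
	"strings"
)

// analysisSteps returns the analysis stages described in the article for a
// given mime type. The name and the stage labels are hypothetical; they only
// mirror the prose, not the actual Lens implementation.
func analysisSteps(mimeType string) []string {
	switch {
	case mimeType == "application/pdf":
		// PDFs: extracted text goes to bleve, extracted images get OCR and classification.
		return []string{"extract-text-and-images", "bleve", "tesseract-ocr", "tensorflow-classify"}
	case strings.HasPrefix(mimeType, "image/"):
		// Images get the same OCR and classification as images pulled out of PDFs.
		return []string{"tesseract-ocr", "tensorflow-classify"}
	case strings.HasPrefix(mimeType, "text/"):
		// Plain text is fed directly into bleve.
		return []string{"bleve"}
	default:
		// Anything else is not indexable according to the supported list.
		return nil
	}
}

func main() {
	for _, m := range []string{"application/pdf", "image/png", "text/plain", "video/mp4"} {
		fmt.Printf("%-16s -> %v\n", m, analysisSteps(m))
	}
}
```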
## How Does Searching Work
At the most basic level, searching consists of submitting a query, which can range from a single word like blockchain all the way up to a phrase like blockchain data storage. We also support more complex queries, such as filtering against specific tags, categories, mime types, and more; these filters are entirely optional.
The response to your query is an array of documents, each containing the IPFS hash of the content that matched your query, the mime type of that content, and a score indicating how relevant the content is to your search query.
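To make the shape of a response concrete, here is a small Go sketch. The `Document` struct and the `byRelevance` helper are hypothetical stand-ins with the three fields described above; the real message types come from the published protocol buffers and may be named differently. The hashes are placeholders, not real CIDs, and whether Lens already returns results in score order is not something this sketch assumes.

```go
package main

import (
	"fmt"
	"sort"
)

// Document is a hypothetical stand-in for one search result: the IPFS hash,
// the mime type, and the relevance score described in the article.
type Document struct {
	Hash     string
	MimeType string
	Score    float64
}

// byRelevance returns a copy of docs sorted from most to least relevant.
func byRelevance(docs []Document) []Document {
	out := make([]Document, len(docs))
	copy(out, docs)
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	// Placeholder results for illustration only.
	docs := []Document{
		{Hash: "QmPlaceholderA", MimeType: "text/plain", Score: 0.41},
		{Hash: "QmPlaceholderB", MimeType: "application/pdf", Score: 0.87},
		{Hash: "QmPlaceholderC", MimeType: "image/png", Score: 0.12},
	}
	for _, d := range byRelevance(docs) {
		fmt.Printf("%.2f  %-16s %s\n", d.Score, d.MimeType, d.Hash)
	}
}
```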

## Installing Lens
There are a few different ways to install Lens, the simplest being our prebuilt Lens docker image. By default, the docker image starts the gRPC server listening on 0.0.0.0:9998, without any encryption, and with a gRPC authentication key of blahblahblah. The docker container also needs a connection to an IPFS HTTP API, with the default being 127.0.0.1:5001. To install this docker image, run the following command:
$> docker pull rtradetech/lens:latest
Alternatively, for those wanting a more hands-off setup, we have a docker-compose setup that also spins up the required IPFS node. To use this docker-compose file, run the following commands. They use the /tmp directory as the base directory for storing all files.
$> wget -O lens.yml https://raw.githubusercontent.com/RTradeLtd/Lens/master/lens.yml
$> LENS=latest BASE=/tmp docker-compose -f lens.yml up
## Using Lens
Before we get started with how you can use Lens: we’ve published the existing Lens index, as seen on [https://temporal.cloud/lens](https://temporal.cloud/lens), via IPFS. It can be downloaded via the CID [QmZqSYDQrtWg4LHnqT6DPqa1XUr7u4oeaGcyaTiGHJY3SR](https://gateway.temporal.cloud/ipfs/QmZqSYDQrtWg4LHnqT6DPqa1XUr7u4oeaGcyaTiGHJY3SR). It’s 1.2GB in size and contains a variety of research papers and crypto whitepapers that I have submitted, as well as other user-submitted documents.
All indexing and searching can be done via the gRPC API, for which we have published protocol buffers on [GitHub](https://github.com/RTradeLtd/grpc/tree/master/lensv2). Using these you can build an API client for Lens in *any* language that supports protocol buffers!
For an example of how we use those protocol buffers to build the Lens API client that is in Temporal, you can check out our Golang example below:
```go
package clients

import (
	"fmt"

	"github.com/RTradeLtd/config/v2"
	"github.com/RTradeLtd/grpc/dialer"
	pb "github.com/RTradeLtd/grpc/lensv2"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

const (
	// defaultURL is used when no Lens URL is configured
	defaultURL = "127.0.0.1:9998"
)

// LensClient is a lens client used to make requests to the Lens gRPC server
type LensClient struct {
	conn *grpc.ClientConn
	pb.LensV2Client
}

// NewLensClient is used to generate our lens client
func NewLensClient(opts config.Services) (*LensClient, error) {
	dialOpts := make([]grpc.DialOption, 0)
	if opts.Lens.TLS.CertPath != "" {
		// a certificate is configured: use TLS and send the auth key over it
		creds, err := credentials.NewClientTLSFromFile(opts.Lens.TLS.CertPath, "")
		if err != nil {
			return nil, fmt.Errorf("could not load tls cert: %s", err)
		}
		dialOpts = append(dialOpts,
			grpc.WithTransportCredentials(creds),
			grpc.WithPerRPCCredentials(dialer.NewCredentials(opts.Lens.AuthKey, true)))
	} else {
		// no certificate: fall back to an unencrypted connection
		dialOpts = append(dialOpts,
			grpc.WithInsecure(),
			grpc.WithPerRPCCredentials(dialer.NewCredentials(opts.Lens.AuthKey, false)))
	}

	// use the default address when none is configured
	var url string
	if opts.Lens.URL == "" {
		url = defaultURL
	} else {
		url = opts.Lens.URL
	}

	conn, err := grpc.Dial(url, dialOpts...)
	if err != nil {
		return nil, err
	}
	return &LensClient{
		conn:         conn,
		LensV2Client: pb.NewLensV2Client(conn),
	}, nil
}

// Close shuts down the client's gRPC connection
func (l *LensClient) Close() { l.conn.Close() }
```
To actually index data, once you have your gRPC client up and running, all you need to do is call the Index command and let Lens do its magic! Depending on where the content is in the network, this process can take some time. Generally speaking, if the content is locally available, index analysis shouldn't take more than a minute, and usually takes about 30 seconds. When submitting data for indexing, you must provide two parameters. The first is ObjectType, which should be set to IndexReq_IPLD as defined in the protocol buffers. The second is ObjectIdentifier, which should be the IPFS hash of the content you want indexed.
Searching for data is just as simple, and requires calling the Search command. The only required parameter is Query, which defines what you want to search for. Optionally, you can narrow your search results further with filters such as Hashes, to only match specific IPFS hashes, or MimeTypes, to only match specific mime types. The time it takes for this command to complete depends on a variety of factors, such as the size of your index, the number of objects matched, and the speed of the disk that the index resides on.
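The following sketch shows how the required Query and the optional Hashes and MimeTypes filters combine. It runs against a tiny in-memory slice rather than a Lens server: `SearchRequest`, `entry`, and `search` are hypothetical stand-ins, and the case-insensitive substring match is only a placeholder for the much richer matching that bleve performs.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SearchRequest is a local stand-in for the parameters named in the article.
type SearchRequest struct {
	Query     string   // required
	Hashes    []string // optional: only match these IPFS hashes
	MimeTypes []string // optional: only match these mime types
}

// entry is a hypothetical indexed item used for this illustration.
type entry struct {
	Hash, MimeType, Text string
}

func contains(list []string, v string) bool {
	for _, s := range list {
		if s == v {
			return true
		}
	}
	return false
}

// search applies the query first, then each filter that was actually set.
func search(index []entry, req SearchRequest) ([]entry, error) {
	if strings.TrimSpace(req.Query) == "" {
		return nil, errors.New("query is required")
	}
	q := strings.ToLower(req.Query)
	var out []entry
	for _, e := range index {
		if !strings.Contains(strings.ToLower(e.Text), q) {
			continue
		}
		if len(req.Hashes) > 0 && !contains(req.Hashes, e.Hash) {
			continue
		}
		if len(req.MimeTypes) > 0 && !contains(req.MimeTypes, e.MimeType) {
			continue
		}
		out = append(out, e)
	}
	return out, nil
}

func main() {
	index := []entry{
		{Hash: "QmPlaceholderA", MimeType: "text/plain", Text: "blockchain data storage"},
		{Hash: "QmPlaceholderB", MimeType: "application/pdf", Text: "A blockchain whitepaper"},
	}
	res, err := search(index, SearchRequest{Query: "blockchain", MimeTypes: []string{"application/pdf"}})
	fmt.Println(res, err)
}
```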
## Thank you and a big shout out to everyone contributing to IPFS and all the great work that is being done by many different projects!
Find RTrade’s online community on [Twitter](https://twitter.com/rtradetech) or [Telegram](https://t.me/rtradetech), and visit our [website](http://www.rtradetechnologies.com/en/). Don’t forget to show Temporal some love on [GitHub](https://github.com/RTradeLtd)!
[v2.1.0 of Temporal is out!](https://github.com/RTradeLtd/Temporal/releases/tag/v2.1.0)
Highlights of release:
- go-ipfs v0.4.20
- ipfs-cluster v0.10.1
- gomod support
**Temporal:** A versatile, easy-to-use tool for companies with large amounts of data to secure, store, and track. The platform can be used as is, or custom built to manage and deploy blockchain-based applications and non-blockchain data-storage solutions for any enterprise.
**Temporal Features:**
If you don’t want to run your own Temporal installation, you can use our hosted version, a [Full Featured Pinning Service](https://temporal.cloud/) with 3GB free per month, 5 free IPNS record creations a month, 100 free pubsub messages a month, and 5 free IPFS keys.
- [Interface walk-through](https://medium.com/@rtradetech/temporal-cloud-walk-through-c477568be551)
- [Full Service IPFS API](https://gateway.temporal.cloud/ipns/docs.api.temporal.cloud/)
- [Temporal-JS SDK Full public IPFS and IPNS usage](https://github.com/clemlak/temporal-js)
- [IPFS Gateway](https://gateway.temporal.cloud)
- [I2P IPFS Gateway access](http://cdii3ou5mve5sfxyirs6kogt4tbvivk2d6o25awbcbazjrlhjeza.b32.i2p)
- [Installing your own Temporal](https://rtradetechnologies.atlassian.net/wiki/x/U4JIAw)
Also the [Usages and Features](https://github.com/RTradeLtd/Temporal) section of the README.md doc on the GitHub repository covers using the docker compose file to spin up the environment.
### **Anything you build or use on our platform is NOT vendor locked-in. All software solutions currently available can be run in your own infrastructure simply by downloading our code from GitHub.**
