Revolutionising the Ingestion Chain


All your Extraction, Transformation and Load needs

RAVN Pipeline delivers a platform approach to all your Extraction, Transformation and Load (ETL) needs. Whether you’re dynamically ingesting content into a search engine, connecting and ingesting sources to Microsoft SharePoint, or simply performing a data migration exercise after a merger or acquisition, RAVN Pipeline provides the assurance, demanded by both the IT department and the business, that the job can be completed successfully and reliably.

Competitive Advantage

The fully flexible, staged approach to ingestion that RAVN Pipeline provides allows businesses to perform complex extraction, transformation and enhancement of content as it is ingested. Aside from the obvious benefits of having content rapidly and robustly ingested and made available to your business, the ability to enhance content through the application of business rules or fuzzy logic (especially when combined with the RAVN Core and RAVN Linking engines) can provide the additional insight that makes all the difference.


The repeatable and auditable nature of the RAVN Pipeline ingestion chain provides confidence that content that should be part of a data corpus has actually been ingested correctly, or, if it hasn’t, shows where and why it failed. Too often, legacy approaches to data connectivity provide little or no audit trail when ingestion fails, and no statistics when it succeeds. With RAVN Pipeline, you can be sure of the status of all connectivity and ingestion jobs, allowing you to take appropriate remedial action or simply to report reliably on success.

Eliminate Dependence on Third Party Vendors

The cost and potential delays of such dependencies can be prohibitive and limit your ability as an organisation to respond to inevitable changes. By adopting our user-driven approach, your IT function can quickly and easily effect changes to existing connections (perhaps reflecting changes in the structure of those content sources at upgrade time), or add new sources via the graphical user interface (GUI), without recourse to third-party vendors. In this way, the efficacy of potentially mission-critical applications is not compromised. To support efficiency, pre-defined ingestion configurations can be deployed or subtly re-purposed with minimal effort, saving time.

Crawling and Fetching

RAVN Pipeline comes with a series of built-in connectors allowing you to crawl and fetch content from the different information sources that matter to your organisation. Whether your valuable information resides in a recent or legacy Enterprise system or outside of your organisation on the Internet, RAVN Pipeline is the ideal product to connect to it and fetch it. Out-of-the-box connectors cover the typical well-known Enterprise systems, such as Microsoft SharePoint, Elasticsearch, shared drives, FTP/SFTP drives, Alfresco and virtually any Document Management System (DMS), as well as Internet and social media sources such as websites, REST APIs, Twitter, YouTube and Facebook. It is also possible to go beyond the standard offering and develop your own connector. Pipeline also supports a push API which allows you to push your own content, unlocking an unlimited number of creative solutions.

Unlocking Information

RAVN Pipeline unlocks the text and metadata from your documents and information repositories. The connectors can retrieve all the relevant information from your data repositories. Additionally, Pipeline is able to extract the text and metadata from hundreds of different file formats, ranging from the commonly used Microsoft Office documents and PDFs to legacy formats such as WordPerfect and dBase. Using a pluggable document filter mechanism, it is also possible to write your own document filters and submit them to the RAVN and Pipeline community for inclusion in future releases.
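To illustrate the idea behind a pluggable document filter, here is a minimal sketch in Python. The `DocumentFilter` base class, the extension registry and the `extract_document` helper are invented for illustration; they are not the actual Pipeline filter interface.

```python
# Hypothetical sketch of a pluggable document filter mechanism.
# The class shape and registry are assumptions, not the real Pipeline API.

class DocumentFilter:
    """Base class a custom filter would implement."""
    extensions: tuple = ()

    def extract(self, raw: bytes) -> dict:
        raise NotImplementedError

class PlainTextFilter(DocumentFilter):
    """Trivial filter for plain-text files."""
    extensions = (".txt", ".log")

    def extract(self, raw: bytes) -> dict:
        text = raw.decode("utf-8", errors="replace")
        return {"text": text, "metadata": {"length": len(text)}}

# Registry mapping file extensions to filter instances,
# as an ingestion pipeline might maintain internally.
REGISTRY = {ext: PlainTextFilter() for ext in PlainTextFilter.extensions}

def extract_document(name: str, raw: bytes) -> dict:
    """Dispatch a raw document to the filter registered for its extension."""
    ext = name[name.rfind("."):].lower()
    return REGISTRY[ext].extract(raw)
```

A real filter would parse a binary format (e.g. a legacy word-processor file) instead of decoding plain text, but the dispatch-by-extension pattern is the same.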

Combining Data Sources

Using the flexible and configurable index job pipelines, it is possible to conceive and implement even the most creative data transformation pipes in a matter of minutes. One typical use case of the staged jobs is combining information from different sources. With Pipeline it is a trivial exercise to, for instance, crawl project documents from a shared drive, enrich them with metadata from a legacy project management system backed by SQL Server, call out to a REST service for project manager information and finally index everything into the corporate Microsoft SharePoint 2013 or 2016 Search.
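The staged-job idea can be sketched as a chain of plain functions, each standing in for one pipeline stage. The stage names, data shapes and lookup keys below are assumptions for illustration, not Pipeline's real configuration model.

```python
# Illustrative sketch of a staged index job: crawl, enrich, index.
# Every function here is a stand-in for a configured Pipeline stage.

def crawl_share(paths):
    # Stand-in for a shared-drive crawler stage: derive a project
    # code from the folder structure (assumed layout projects/<code>/...).
    return [{"path": p, "project": p.split("/")[1]} for p in paths]

def enrich_from_db(docs, project_db):
    # Stand-in for a SQL lookup stage keyed on the project code.
    for doc in docs:
        doc.update(project_db.get(doc["project"], {}))
    return docs

def index(docs, target):
    # Stand-in for the final indexing stage (e.g. SharePoint Search).
    target.extend(docs)
    return docs

def run_pipeline(paths, project_db, target):
    docs = crawl_share(paths)
    docs = enrich_from_db(docs, project_db)
    return index(docs, target)
```

For example, crawling `projects/P100/spec.docx` against a lookup table mapping `P100` to its project manager yields an indexed document carrying both the file path and the manager's name.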

Targeted Website Crawling

Web scraping is a useful technique for converting the unstructured information available on the web into a structured form. Using web scraping it is possible, for instance, to crawl and retrieve property postings from various sources and build a powerful property search engine. The built-in web filters make web scraping very easy to configure and use: scraping rules can be configured and tested through the web-based interface, and specific parts of the crawled web pages can easily be loaded into a search index or a SQL database.
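The essence of a scraping rule, extracting specific parts of a page, can be shown with Python's standard-library HTML parser. This is a minimal sketch; Pipeline's own web filters are configured through its GUI rather than written in code.

```python
# Minimal scraping-rule sketch: pull the text of every matching tag
# (here <h2>) out of a crawled page, using only the standard library.
from html.parser import HTMLParser

class TagTextExtractor(HTMLParser):
    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.inside = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.inside = True

    def handle_endtag(self, tag):
        if tag == self.tag:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.results.append(data.strip())

def scrape(html, tag="h2"):
    """Return the text content of every <tag> element in the page."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    return parser.results
```

The extracted values could then be written to a search index or a SQL table, one row per matched element.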


Agile Control

RAVN Pipeline is controlled and configured through a graphical web user interface, removing the need for specialist knowledge of local operating systems, text editors or data-source-specific configuration editors. By moving the configuration responsibility from a developer role to an administrator or even a business user, we allow you to seriously reduce the time it takes to make changes to your ETL processes. In addition, beyond the series of standard plug-and-play stages provided, a web-based scripting stage allows you to go beyond the functionality provided out of the box.


Intelligent Data Processing

The data extraction process is controlled via various data tracking mechanisms that ensure incremental indexing, where only new or altered data is processed.
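One common way to implement this kind of data tracking is to keep a content hash per document and re-process only items whose hash is new or has changed. The tracking-store shape below is an assumption for illustration, not Pipeline's internal mechanism.

```python
# Sketch of incremental indexing via content hashing: only new or
# altered documents pass through to the processing stages.
import hashlib

def incremental_batch(docs, seen_hashes):
    """Return the IDs of documents that are new or changed,
    updating the tracking store in place.

    docs        -- mapping of document ID to its current content
    seen_hashes -- mapping of document ID to the last-seen content hash
    """
    to_process = []
    for doc_id, content in docs.items():
        digest = hashlib.sha256(content.encode()).hexdigest()
        if seen_hashes.get(doc_id) != digest:
            seen_hashes[doc_id] = digest
            to_process.append(doc_id)
    return to_process
```

On the first run every document is processed; on subsequent runs an unchanged corpus yields an empty batch, and editing a single document re-queues only that one.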



The architecture allows many instances to be distributed for scaling and load-balancing purposes, in support of Big Data environments or applications where the volume of content and the speed of processing are important.


Job Based Configuration

Once a Pipeline job has been created it is trivial to reuse it for a different data source, drastically reducing implementation times.
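Reusing a job amounts to copying its definition and swapping out only the source. The job-definition shape below is hypothetical, shown purely to make the reuse pattern concrete.

```python
# Hypothetical job definition: reuse the same stages and target
# for a second data source by copying and overriding the source.
import copy

base_job = {
    "source": {"type": "sharepoint", "url": "https://intranet/site-a"},
    "stages": ["extract_text", "detect_language", "index"],
    "target": "corporate-search",
}

# Same transformation chain, different source: an FTP archive.
ftp_job = copy.deepcopy(base_job)
ftp_job["source"] = {"type": "ftp", "url": "ftp://archive.example.com"}
```

The deep copy matters: it keeps the two job definitions independent, so later edits to one cannot silently alter the other.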


Extend Pipeline Using its API

Using the Pipeline REST API, it is possible to control, configure and monitor Pipeline and the different jobs configured. It is also possible to build custom applications which push data through the Pipeline. The data transformation and load possibilities become virtually unlimited using the data push API.
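A push client might look like the sketch below. The endpoint path (`/api/push`) and payload fields are assumptions made for illustration; consult the Pipeline API documentation for the real contract.

```python
# Hypothetical sketch of pushing a document to a REST push API.
# Endpoint and payload shape are assumed, not the documented contract.
import json
from urllib import request

def build_push_request(base_url, doc_id, text, metadata):
    """Build a POST request carrying one document as JSON."""
    payload = json.dumps({
        "id": doc_id,
        "text": text,
        "metadata": metadata,
    }).encode("utf-8")
    return request.Request(
        f"{base_url}/api/push",            # assumed endpoint path
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending the request would look like this (not executed here):
# req = build_push_request("https://pipeline.local", "doc-1",
#                          "contract text...", {"source": "dms"})
# with request.urlopen(req) as resp:
#     print(resp.status)
```

Separating request construction from sending keeps the payload logic testable without a live Pipeline instance.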


Need more information?

Read our Pipeline Brochure


Latest Case Study

Read our latest customer experience showing how our products have benefited their organisation

    As a trusted adviser and negotiation partner, we wanted to have first hand experience of a leading AI technology so we can advise on how it will change the workplace over the coming years to ensure we’re offering the most appropriate advice to our clients. 
    Ole Møller, Vice President at Djøf
    This technology will bring us a flexible environment that is dedicated to our firm so we can ensure we can look after our clients as efficiently as possible. We chose to collaborate with RAVN as we recognised them as a leading AI provider and wanted their product to be part of the firm’s portfolio. 
    Santiago Gómez Sancha, ICT Director at Uría Menéndez
    For our lawyers, time is very precious and we needed a fast, reliable and accurate search engine that was easily integrated into our existing systems. The team at RAVN proved they could tick all the boxes we required. 
    Flavio Romerio, Partner at Homburger
    The software will read, interpret and extract key provisions from a client’s property lease agreements. This approach is a great supplement to manually laborious processes, and a stand-alone device in relation to certain standardized agreements and will mitigate risk from human errors and inconsistencies. SVW is happy to continue the collaboration with RAVN to further improve the SVW real estate robot. 
    Peter Van Dam, Knowledge and IT Manager at Simonsen Vogt Wiig
    Garrigues are always looking for innovative ways to ensure we are providing the most efficient service to our clients. We chose to work with RAVN as they are leaders in the industry and able to deliver on both our current and future plans. 
    César Mejias, IT Director at Garrigues