Introduction

This document provides a detailed guide on setting up the MongoDB source connector for the DataBrew project. The MongoDB connector allows for efficient data integration from MongoDB collections into your DataBrew data pipeline, ensuring a seamless data flow from your MongoDB databases.

The MongoDB connector in DataBrew supports both streaming directions: it can act as either a source or a destination.

Requirements

Before setting up the MongoDB source connector, ensure you meet the following requirements:

  • Access to your MongoDB database.
  • Necessary permissions to read data from the MongoDB collections intended for use in your pipeline.
  • Understanding of MongoDB’s data model and query language to accurately configure the data extraction.

Preparing Your MongoDB Database

To ensure smooth integration:

  • Indexing: Ensure proper indexing on your collections to optimize query performance.
  • User Permissions: Create a database user for the DataBrew project with read access to the necessary collections.
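The two preparation steps above can be sketched as MongoDB command documents. This is a minimal illustration, not DataBrew's own procedure: the user name, password, database, collection, and field are hypothetical placeholders. Note that MongoDB's built-in `read` role covers every non-system collection in the database; limiting access to specific collections would require a custom role.

```python
def read_only_user_command(username: str, password: str, database: str) -> dict:
    """Build a MongoDB `createUser` command granting the built-in `read` role."""
    return {
        "createUser": username,
        "pwd": password,
        "roles": [{"role": "read", "db": database}],
    }


def single_field_index_command(collection: str, field: str) -> dict:
    """Build a MongoDB `createIndexes` command for an ascending single-field index."""
    return {
        "createIndexes": collection,
        "indexes": [{"key": {field: 1}, "name": f"{field}_1"}],
    }


# Hypothetical names, for illustration only.
user_cmd = read_only_user_command("databrew_reader", "change-me", "sales")
index_cmd = single_field_index_command("orders", "updated_at")

# With a driver such as PyMongo, each document could be passed to
# Database.command(), for example: client["sales"].command(user_cmd)
```

Which field to index depends on the query filters you plan to use in the connector, so `updated_at` here is only an assumed example.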

Cloud Setup

This section guides you through setting up the MongoDB source connector in the cloud for the DataBrew project.

Setting up in Cloud

  1. Access DataBrew Cloud Platform: Navigate to the DataBrew Cloud App.
  2. Create a New Source Connector Instance: Follow these steps…
    • Step 1: Choose ‘MongoDB’ from the list of available source connectors.
    • Step 2: Provide the necessary connection details, including your MongoDB host, database name, and credentials.
    • Step 3: Configure the connector by specifying the collections and any query filters if required.
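The connection details from Step 2 and a query filter from Step 3 might look like the sketch below. It assumes the standard `mongodb://` connection string format; whether DataBrew expects a single connection string or separate fields is not stated here, and the host, database, credentials, and filter are hypothetical. Credentials are percent-encoded because characters such as `@` or `/` would otherwise break the URI.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(host: str, database: str, username: str,
                      password: str, port: int = 27017) -> str:
    """Assemble a standard MongoDB connection string with encoded credentials."""
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}/{database}"


# Hypothetical values, for illustration only.
uri = build_mongodb_uri("db.example.com", "sales", "databrew_reader", "p@ss/word")

# A query filter in MongoDB's query document syntax: only documents
# whose `status` field equals "active" would be extracted.
query_filter = {"status": "active"}
```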

Source connector limitations

Be aware that DataBrew cannot automatically infer the structure of the data inside MongoDB collections. You will therefore be asked to provide a data sample, from which a compatible schema is generated.

If your schema is not consistent, your data may not be processed.
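One way to check a collection before providing a sample is to compare field types across a handful of documents. The sketch below is an assumption about what "consistent" could mean (the same top-level fields with the same types in every document), not DataBrew's actual validation logic, and the sample documents are invented.

```python
from collections import defaultdict


def field_types(documents):
    """Map each top-level field to the set of Python type names seen for it."""
    seen = defaultdict(set)
    for doc in documents:
        for field, value in doc.items():
            seen[field].add(type(value).__name__)
    return dict(seen)


def inconsistent_fields(documents):
    """Report fields with mixed types or that are missing from some documents."""
    problems = {}
    for field, types in field_types(documents).items():
        present = sum(1 for doc in documents if field in doc)
        if len(types) > 1:
            problems[field] = "mixed types: " + ", ".join(sorted(types))
        elif present < len(documents):
            problems[field] = f"missing in {len(documents) - present} document(s)"
    return problems


# Invented sample documents, for illustration only.
sample = [
    {"_id": 1, "name": "Ada", "age": 36},
    {"_id": 2, "name": "Grace", "age": "45"},  # age stored as a string
    {"_id": 3, "name": "Linus"},               # age missing
]
report = inconsistent_fields(sample)
```

Here `report` flags `age` because it appears as both an integer and a string. The check only looks at top-level fields, so nested documents and arrays would need a recursive variant.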