RocketMQ Connect in Action 4

SFTP Server (File Data) -> RocketMQ Connect -> SFTP Server (File)

Preparation

Start RocketMQ

  1. Linux/Unix/Mac
  2. 64bit JDK 1.8+;
  3. Maven 3.2.x+;
  4. Start RocketMQ. Either RocketMQ 4.x or RocketMQ 5.x 5.x version can be used;
  5. Test RocketMQ message sending and receiving using the tool.

Here, use the environment variable NAMESRV_ADDR to inform the tool client of the NameServer address of RocketMQ as localhost:9876.

  1. #$ cd distribution/target/rocketmq-4.9.7/rocketmq-4.9.7
  2. $ cd distribution/target/rocketmq-5.1.4/rocketmq-5.1.4
  3. $ export NAMESRV_ADDR=localhost:9876
  4. $ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Producer
  5. SendResult [sendStatus=SEND_OK, msgId= ...
  6. $ sh bin/tools.sh org.apache.rocketmq.example.quickstart.Consumer
  7. ConsumeMessageThread_%d Receive New Messages: [MessageExt...

Note: RocketMQ has the feature of automatically creating Topic and Group. When sending or subscribing to messages, if the corresponding Topic or Group does not exist, RocketMQ will automatically create them. Therefore, there is no need to create Topic and Group in advance.

Build Connector Runtime

  1. git clone https://github.com/apache/rocketmq-connect.git
  2. cd rocketmq-connect
  3. export RMQ_CONNECT_HOME=`pwd`
  4. mvn -Prelease-connect -Dmaven.test.skip=true clean install -U

Build SFTP Connector Plugin

  1. cd $RMQ_CONNECT_HOME/connectors/rocketmq-connect-sftp/
  2. mvn clean package -Dmaven.test.skip=true

Put the compiled jar of the SFTP RocketMQ Connector into the Plugin directory for runtime loading.

  1. mkdir -p /Users/YourUsername/rocketmqconnect/connector-plugins
  2. cp target/rocketmq-connect-sftp-0.0.1-SNAPSHOT-jar-with-dependencies.jar /Users/YourUsername/rocketmqconnect/connector-plugins

Run Connector Worker in Standalone Mode

Modify the connect-standalone.conf file to configure the RocketMQ connection address and other information.

  1. cd $RMQ_CONNECT_HOME/distribution/target/rocketmq-connect-0.0.1-SNAPSHOT/rocketmq-connect-0.0.1-SNAPSHOT
  2. vim conf/connect-standalone.conf

Example configuration information is as follows:

  1. workerId=standalone-worker
  2. storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot
  3. ## Http port for user to access REST API
  4. httpPort=8082
  5. # Rocketmq namesrvAddr
  6. namesrvAddr=localhost:9876
  7. # RocketMQ acl
  8. aclEnable=false
  9. #accessKey=rocketmq
  10. #secretKey=12345678
  11. clusterName="DefaultCluster"
  12. # Plugin path for loading Source/Sink Connectors
  13. pluginPaths=/Users/YourUsername/rocketmqconnect/connector-plugins

In standalone mode, RocketMQ Connect persistently stores the synchronization checkpoint information in the local file directory specified by storePathRootDir.

storePathRootDir=/Users/YourUsername/rocketmqconnect/storeRoot

If you want to reset the synchronization checkpoint, you need to delete the persisted checkpoint information files.

  1. rm -rf /Users/YourUsername/rocketmqconnect/storeRoot/*

To start Connector Worker in standalone mode:

  1. sh bin/connect-standalone.sh -c conf/connect-standalone.conf &

Set up an SFTP server

SFTP (SSH File Transfer Protocol) is a file transfer protocol used for secure file transfers between computers. SFTP is built on top of the SSH (Secure Shell) protocol and utilizes encryption and authentication.

We will use the built-in SFTP service in macOS (by enabling “Remote Login” access). For detailed instructions, please refer to the Allow a remote computer to access your Macdocument.

Create Source Test File

Create a test file named source.txt and write some test data to it:

  1. mkdir -p /Users/YourUsername/rocketmqconnect/sftp-test/
  2. cd /Users/YourUsername/rocketmqconnect/sftp-test/
  3. touch source.txt
  4. echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00
  5. Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00
  6. Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt

Log in to the SFTP service to verify that you can access it normally. Enter the following command, then enter your password :

  1. # sftp -P port YourUsername@hostname
  2. sftp -P 22 YourUsername@127.0.0.1

Note: Since this is the SFTP service provided by your local MAC OS, the address is 127.0.0.1 and the port is the default 22.

  1. sftp> cd /Users/YourUsername/rocketmqconnect/sftp-test/
  2. sftp> ls source.txt
  3. sftp> bye

Start Connector

Start SFTP Source Connector

Run the following command to start the SFTP source connector. This connector will connect to the SFTP service to read from the source.txt file. For each line of text in the file, the connector will parse and package the contents into a generic ConnectRecord object, which will then be sent to a RocketMQ topic for consumption by sink connectors.

  1. curl -X POST --location "http://localhost:8082/connectors/SftpSourceConnector" --http1.1 \
  2. -H "Host: localhost:8082" \
  3. -H "Content-Type: application/json" \
  4. -d '{
  5. "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSourceConnector",
  6. "host": "127.0.0.1",
  7. "port": 22,
  8. "username": "YourUsername",
  9. "password": "yourPassword",
  10. "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/source.txt",
  11. "connect.topicname": "sftpTopic",
  12. "fieldSeparator": "|",
  13. "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit"
  14. }'

If the curl request returns status: 200, it indicates that the connector was successfully created. An example response would look like this:

  1. {"status":200,"body":{"connector.class":"...

To confirm that the file source connector has started successfully, run the following command:

  1. tail -100f ~/logs/rocketmqconnect/connect_runtime.log

Start connector SftpSourceConnector and set target state STARTED successed!!

Start SFTP Sink Connector

Run the following command to start the SFTP sink connector. This connector will subscribe to the RocketMQ topic to consume messages and convert each one into a single line of text, which will then be written to the destination file sink.txt using the SFTP protocol:

  1. curl -X POST --location "http://localhost:8082/connectors/SftpSinkConnector" --http1.1 \
  2. -H "Host: localhost:8082" \
  3. -H "Content-Type: application/json" \
  4. -d '{
  5. "connector.class": "org.apache.rocketmq.connect.http.sink.SftpSinkConnector",
  6. "host": "127.0.0.1",
  7. "port": 22,
  8. "username": "YourUsername",
  9. "password": "yourPassword",
  10. "filePath": "/Users/YourUsername/rocketmqconnect/sftp-test/sink.txt",
  11. "connect.topicnames": "sftpTopic",
  12. "fieldSeparator": "|",
  13. "fieldSchema": "username|idCardNo|orderNo|orderAmount|trxDate|trxTime|profit"
  14. }'

If the curl request returns status: 200, it indicates that the connector was successfully created. An example response would look like this:

  1. {"status":200,"body":{"connector.class":"...

Check the logs to confirm successful startup of the SFTP sink connector:

  1. tail -100f ~/logs/rocketmqconnect/connect_runtime.log

Start connector SftpSinkConnector and set target state STARTED successed!!

Confirm that the data has been written to the destination file by running the following command:

  1. cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt

If the sink.txt file has been generated and its contents match those of the source.txt file, the entire process is working correctly.

Write more test data to the source.txt file to continue testing:

  1. cd /Users/YourUsername/rocketmqconnect/sftp-test/
  2. echo 'John Doe|100000202211290001|20221129001|30000.00|2022-11-28|03:00:00|7.00
  3. Jane Smith|100000202211290002|20221129002|40000.00|2022-11-28|04:00:00|9.00
  4. Bob Johnson|100000202211290003|20221129003|50000.00|2022-11-28|05:00:00|12.00' >> source.txt
  5. # Wait a few seconds to give the connector time to replicate data to the sink file.
  6. sleep 10
  7. cat /Users/YourUsername/rocketmqconnect/sftp-test/sink.txt

Note: The order of file contents may vary because the rocketmq-connect-sftp uses normal message when sending and receiving messages to/from a RocketMQ topic. This is different from ordered message, and consuming normal messages does not guarantee the order.