Basic Concepts

Before you start developing, it helps to understand the following basic concepts:

  • Transform SQL functions: these include arithmetic functions (such as abs and power), time functions (such as localtime and date_format), string functions (such as locate and translate), and so on. A function generally takes one or more parameters, applies a transformation to the input data, and outputs the transformed result.
  • Transform SQL parsers: there are two main kinds. A type parser converts the raw data into an object of the corresponding type; for example, DateParser converts the input into a Java Date object, which makes later conversions easier. A calculation-expression parser evaluates an expression over the converted data and outputs the result, much like a function; for example, AdditionParser parses the a + b part of a SQL statement and outputs the result.
  • Transform SQL operators: these are mainly logical operators (such as and, or, and not) that implement logical judgments and output a Boolean value.
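
The three concepts can be pictured with a small, self-contained sketch. The interfaces below are simplified stand-ins invented for illustration: SimpleValueParser, SimpleOperator, and ConceptSketch are hypothetical names, and a row is modeled as a plain Map. The real ValueParser and ExpressionOperator interfaces take a SourceData, a row index, and a Context instead.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical stand-in for a value parser: turns a row into a value.
interface SimpleValueParser {
    Object parse(Map<String, String> row);
}

// Hypothetical stand-in for a logical operator: turns a row into a boolean.
interface SimpleOperator {
    boolean check(Map<String, String> row);
}

class ConceptSketch {
    // Type parser: reads a column and converts the text to a typed object.
    static SimpleValueParser column(String name) {
        return row -> new BigDecimal(row.get(name));
    }

    // Function: transforms the value produced by its parameter parser (here, abs).
    static SimpleValueParser abs(SimpleValueParser arg) {
        return row -> ((BigDecimal) arg.parse(row)).abs();
    }

    // Calculation parser: evaluates an expression such as a + b.
    static SimpleValueParser add(SimpleValueParser left, SimpleValueParser right) {
        return row -> ((BigDecimal) left.parse(row)).add((BigDecimal) right.parse(row));
    }

    // Logical operator: a judgment whose output is a Boolean (here, not).
    static SimpleOperator not(SimpleOperator inner) {
        return row -> !inner.check(row);
    }
}
```

For a row with a = -3 and b = 5, add(abs(column("a")), column("b")) evaluates to 8: each node only asks its children for a value, which is the same composition pattern the real classes follow.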

Function Development

This section describes how to add a new function.

Create Function Class File

Function implementation classes are stored in this directory. Once you have decided which function to add, create a new class in this directory. The class name is the function name followed by Function, for example AbsFunction.

Basic Code Framework Construction

After creating the class, build the basic framework of the code, taking AbsFunction as an example:

    /**
     * AbsFunction
     * description: abs(numeric)--returns the absolute value of numeric
     */
    @TransformFunction(names = {"abs"})
    public class AbsFunction implements ValueParser {

        @Override
        public Object parse(SourceData sourceData, int rowIndex, Context context) {
            // Placeholder so the skeleton compiles; the logic is added in a later step.
            return null;
        }
    }

Add a class comment and the @TransformFunction annotation to the function class. The class must implement the ValueParser interface and override its parse method.

Add Constructor and ValueParser Object

Add a parameterized constructor and related ValueParser member variables to the function. In the constructor, parse the function expression and initialize the parameter parser object. Taking AbsFunction as an example:

    private ValueParser numberParser;

    public AbsFunction(Function expr) {
        numberParser = OperatorTools.buildParser(expr.getParameters().getExpressions().get(0));
    }

The number of ValueParser objects is the same as the number of function parameters.
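
To illustrate "one parser per parameter", here is a self-contained sketch of a two-parameter function, power(base, exponent). It is an assumption-laden illustration rather than the project's code: ArgParser is a hypothetical stand-in for ValueParser, a row is a plain Map, and the constructor receives a list of already-built parsers instead of a parsed SQL Function expression.

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ValueParser; a row maps column names to raw text.
interface ArgParser {
    Object parse(Map<String, String> row);
}

// Sketch of a two-parameter function: one parser field per parameter.
class PowerFunctionSketch implements ArgParser {

    private final ArgParser baseParser;
    private final ArgParser exponentParser;

    PowerFunctionSketch(List<ArgParser> params) {
        // Parameter order follows the SQL call: power(base, exponent).
        this.baseParser = params.get(0);
        this.exponentParser = params.get(1);
    }

    @Override
    public Object parse(Map<String, String> row) {
        // Evaluate each parameter first, then apply the function logic.
        BigDecimal base = new BigDecimal(String.valueOf(baseParser.parse(row)));
        BigDecimal exponent = new BigDecimal(String.valueOf(exponentParser.parse(row)));
        return Math.pow(base.doubleValue(), exponent.doubleValue());
    }
}
```

With a row where x = 2 and y = 10, a PowerFunctionSketch built from parsers for x and y returns 1024.0.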

Function Implementation

Override the parse method: evaluate the parameters, apply the function logic, and return the result. Taking AbsFunction as an example:

    @Override
    public Object parse(SourceData sourceData, int rowIndex, Context context) {
        // Evaluate the single parameter for the current row.
        Object numberObj = numberParser.parse(sourceData, rowIndex, context);
        // Convert it to BigDecimal and return its absolute value.
        BigDecimal numberValue = OperatorTools.parseBigDecimal(numberObj);
        return numberValue.abs();
    }

Add Unit Test Code

Each function must pass unit tests that verify its logic. The unit test class is located in this directory. All test methods for a function go in the same unit test class, which is named Test + function name + Function (for example, TestAbsFunction). Taking the testAbsFunction() test method as an example:

    @Test
    public void testAbsFunction() throws Exception {
        String transformSql = "select abs(numeric1) from source";
        TransformConfig config = new TransformConfig(transformSql);
        // csvSource and kvSink are the source and sink definitions prepared by the test class.
        TransformProcessor<String, String> processor = TransformProcessor
                .create(config, SourceDecoderFactory.createCsvDecoder(csvSource),
                        SinkEncoderFactory.createKvEncoder(kvSink));
        // case1: |2|
        List<String> output1 = processor.transform("2|4|6|8", new HashMap<>());
        Assert.assertEquals(1, output1.size());
        Assert.assertEquals("result=2", output1.get(0));
        // case2: |-4.25|
        List<String> output2 = processor.transform("-4.25|4|6|8", new HashMap<>());
        Assert.assertEquals(1, output2.size());
        Assert.assertEquals("result=4.25", output2.get(0));
    }

Once these steps are done, the new function is complete and you can submit your code to the community. The complete code of AbsFunction can be seen at code link.

Keep the following points in mind:

  • Some function parameters can be NULL. Handle NULL objects explicitly in the parse method to prevent a NullPointerException.
  • The @TransformFunction annotation accepts several names for one function, as long as each name follows the naming conventions of the relevant databases.
  • Some functions take a variable number of parameters. When constructing the ValueParser objects, check the parameter count to prevent an IndexOutOfBoundsException.
  • Cover as many situations as possible in the unit tests, such as different numbers of parameters and NULL parameters, to ensure that the function outputs correct results under different circumstances.
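
The NULL and variable-parameter precautions can be sketched as follows. This is a self-contained illustration under stated assumptions, not code from the project: RowParser is a hypothetical stand-in for ValueParser, a row is a plain Map, and round(x [, scale]) is used only as an example of a function with an optional second parameter.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for ValueParser; a row maps column names to raw text.
interface RowParser {
    Object parse(Map<String, String> row);
}

// Sketch of round(x [, scale]) with an optional second parameter.
class RoundFunctionSketch implements RowParser {

    private final RowParser numberParser;
    private final RowParser scaleParser; // null when the parameter is omitted

    RoundFunctionSketch(List<RowParser> params) {
        this.numberParser = params.get(0);
        // Check the size before reading index 1 to avoid IndexOutOfBoundsException.
        this.scaleParser = params.size() > 1 ? params.get(1) : null;
    }

    @Override
    public Object parse(Map<String, String> row) {
        Object numberObj = numberParser.parse(row);
        if (numberObj == null) {
            return null; // NULL in, NULL out, instead of a NullPointerException
        }
        int scale = 0; // default when no scale parameter is given
        if (scaleParser != null) {
            Object scaleObj = scaleParser.parse(row);
            if (scaleObj == null) {
                return null;
            }
            scale = Integer.parseInt(scaleObj.toString());
        }
        return new BigDecimal(numberObj.toString()).setScale(scale, RoundingMode.HALF_UP);
    }
}
```

Unit tests for such a function would then exercise each branch: one parameter, two parameters, and a NULL input.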

Parser Development

This section describes how to add a new parser class.

Create Parser Class File

Parsers are stored in this directory. Once you have decided which parser to add, create a new class in this directory. The class name is the type followed by Parser, for example AdditionParser.

Basic Code Framework Construction

After creating the class, build the basic framework of the code, taking AdditionParser as an example:

    /**
     * description: calculate a + b
     */
    @TransformParser(values = Addition.class)
    public class AdditionParser implements ValueParser {

        @Override
        public Object parse(SourceData sourceData, int rowIndex, Context context) {
            // Placeholder so the skeleton compiles; the logic is added in a later step.
            return null;
        }
    }

Add the corresponding @TransformParser annotation to the parser class. The parser class must implement the ValueParser interface and override its parse method.

Add Constructor and Member Variables

Add a parameterized constructor and related member variables to the parser class. In the constructor, parse the input expression and convert it into the corresponding type object. Taking AdditionParser as an example:

    private final ValueParser left;
    private final ValueParser right;

    public AdditionParser(Addition expr) {
        this.left = OperatorTools.buildParser(expr.getLeftExpression());
        this.right = OperatorTools.buildParser(expr.getRightExpression());
    }

Parsing Implementation

Override the parse method. If the parser needs to process the objects built in the previous step any further, implement that logic here; otherwise, return the parsed object directly. Taking AdditionParser as an example:

    @Override
    public Object parse(SourceData sourceData, int rowIndex, Context context) {
        if (this.left instanceof IntervalParser && this.right instanceof IntervalParser) {
            // Both operands are intervals: no result is produced.
            return null;
        } else if (this.left instanceof IntervalParser || this.right instanceof IntervalParser) {
            // Exactly one operand is an interval: treat the other one as a date.
            IntervalParser intervalParser = null;
            ValueParser dateParser = null;
            if (this.left instanceof IntervalParser) {
                intervalParser = (IntervalParser) this.left;
                dateParser = this.right;
            } else {
                intervalParser = (IntervalParser) this.right;
                dateParser = this.left;
            }
            Object intervalPairObj = intervalParser.parse(sourceData, rowIndex, context);
            Object dateObj = dateParser.parse(sourceData, rowIndex, context);
            if (intervalPairObj == null || dateObj == null) {
                return null;
            }
            return DateUtil.dateAdd(OperatorTools.parseString(dateObj),
                    (Pair<Integer, Map<ChronoField, Long>>) intervalPairObj, 1);
        } else {
            // Neither operand is an interval: plain numeric addition.
            return numericalOperation(sourceData, rowIndex, context);
        }
    }
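
The numericalOperation helper called in the last branch is not shown in this guide. The following self-contained sketch shows what such a helper might do, assuming it evaluates both operands, returns NULL if either is NULL, and adds the two values as BigDecimal. OperandParser and AdditionSketch are hypothetical stand-ins (a row is a plain Map), so treat this as an illustration rather than the actual implementation.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical stand-in for ValueParser; a row maps column names to raw text.
interface OperandParser {
    Object parse(Map<String, String> row);
}

// Sketch of the numeric branch of an addition parser.
class AdditionSketch implements OperandParser {

    private final OperandParser left;
    private final OperandParser right;

    AdditionSketch(OperandParser left, OperandParser right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public Object parse(Map<String, String> row) {
        Object leftObj = left.parse(row);
        Object rightObj = right.parse(row);
        // Assumed behavior: a NULL operand makes the whole expression NULL.
        if (leftObj == null || rightObj == null) {
            return null;
        }
        // BigDecimal keeps decimal inputs exact, unlike double arithmetic.
        return new BigDecimal(leftObj.toString()).add(new BigDecimal(rightObj.toString()));
    }
}
```

With numeric1 = 1 and numeric2 = 10, the sketch returns 11, which matches the shape of the unit test case in the next step.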

Add Unit Test Code

Each parser class must pass unit tests that verify its logic. The unit test class is located in this directory. All test methods for a parser go in the same unit test class, which is named Test + parser name + Parser. Taking TestAdditionParser as an example:

    @Test
    public void testAdditionParser() throws Exception {
        String transformSql = null;
        TransformConfig config = null;
        TransformProcessor<String, String> processor = null;
        List<String> output = null;

        transformSql = "select numeric1 + numeric2 from source";
        config = new TransformConfig(transformSql);
        processor = TransformProcessor
                .create(config, SourceDecoderFactory.createCsvDecoder(csvSource),
                        SinkEncoderFactory.createKvEncoder(kvSink));
        // case1: 1 + 10
        output = processor.transform("1|10||||", new HashMap<>());
        Assert.assertEquals(1, output.size());
        Assert.assertEquals("result=11", output.get(0));
    }

Once these steps are done, the new parser class is complete and you can submit your code to the community. The complete code of AdditionParser can be seen at code link.

Logical Operator Development

This section describes how to add a new logical operator class.

Create Logical Operator Class File

Logical operator classes are stored in this directory. Once you have decided which logical operator to add, create a new class in this directory. The class name is the logical operator name followed by Operator, for example AndOperator.

Basic Code Framework Construction

After creating the class, build the basic framework of the code, taking AndOperator as an example:

    @TransformOperator(values = AndExpression.class)
    public class AndOperator implements ExpressionOperator {

        @Override
        public boolean check(SourceData sourceData, int rowIndex, Context context) {
            // Placeholder so the skeleton compiles; the logic is added in a later step.
            return false;
        }
    }

Add the corresponding @TransformOperator annotation to the logical operator class. The operator class needs to implement the ExpressionOperator interface and override the check method in the interface.

Add Constructor and Member Variables

Add a parameterized constructor and related member variables to the class. In the constructor, parse the input expression and construct the objects needed for the judgment logic in the check method. Taking AndOperator as an example:

    private final ExpressionOperator left;
    private final ExpressionOperator right;

    public AndOperator(AndExpression expr) {
        this.left = OperatorTools.buildOperator(expr.getLeftExpression());
        this.right = OperatorTools.buildOperator(expr.getRightExpression());
    }

Operator Implementation

Override the check method. Implement the judgment logic according to the definition of the logical operator, using the objects built in the previous step, and return the result (true or false). Taking AndOperator as an example:

    @Override
    public boolean check(SourceData sourceData, int rowIndex, Context context) {
        // left and right are ExpressionOperator objects, so AND combines their check results.
        return this.left.check(sourceData, rowIndex, context)
                && this.right.check(sourceData, rowIndex, context);
    }
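
To show how a logical operator sits on top of other operators, the self-contained sketch below evaluates a condition of the form (a < 4) and (b > 5). RowCheck and OperatorSketch are hypothetical stand-ins for ExpressionOperator and its implementations, and a row is a plain Map; the comparison helpers are simplified assumptions that only handle numeric text.

```java
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical stand-in for ExpressionOperator.
interface RowCheck {
    boolean check(Map<String, String> row);
}

class OperatorSketch {
    // column < bound, compared numerically.
    static RowCheck lessThan(String column, String bound) {
        return row -> new BigDecimal(row.get(column)).compareTo(new BigDecimal(bound)) < 0;
    }

    // column > bound, compared numerically.
    static RowCheck greaterThan(String column, String bound) {
        return row -> new BigDecimal(row.get(column)).compareTo(new BigDecimal(bound)) > 0;
    }

    // AND of two child checks; && skips the right side when the left is false.
    static RowCheck and(RowCheck left, RowCheck right) {
        return row -> left.check(row) && right.check(row);
    }
}
```

A condition built as and(lessThan("a", "4"), greaterThan("b", "5")) is true for a = 3, b = 8 and false for a = 4, b = 8 or a = 3, b = 4. Unit tests for an operator should cover each of these combinations.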

Add Unit Test Code

Each logical operator class must pass unit tests that verify its logic. The unit test class is located in this directory. All test methods for a logical operator go in the same unit test class, which is named Test + logical operator name + Operator. Taking TestAndOperator as an example:

    @Test
    public void testAndOperator() throws Exception {
        String transformSql = "select if((string2 < 4) and (numeric4 > 5),1,0) from source";
        TransformConfig config = new TransformConfig(transformSql);
        TransformProcessor<String, String> processor = TransformProcessor
                .create(config, SourceDecoderFactory.createCsvDecoder(csvSource),
                        SinkEncoderFactory.createKvEncoder(kvSink));
        // case1: "3.14159265358979323846|3a|4|4"
        List<String> output1 = processor.transform("3.14159265358979323846|3a|4|4");
        Assert.assertEquals(1, output1.size());
        Assert.assertEquals("result=0", output1.get(0));
        // case2: "3.14159265358979323846|5|4|8"
        List<String> output2 = processor.transform("3.14159265358979323846|5|4|8");
        Assert.assertEquals(1, output2.size());
        Assert.assertEquals("result=0", output2.get(0));
        // case3: "3.14159265358979323846|3|4|8"
        List<String> output3 = processor.transform("3.14159265358979323846|3|4|8");
        Assert.assertEquals(1, output3.size());
        Assert.assertEquals("result=1", output3.get(0));

        transformSql = "select if((numeric3 < 4) and (numeric4 > 5),1,0) from source";
        config = new TransformConfig(transformSql);
        processor = TransformProcessor
                .create(config, SourceDecoderFactory.createCsvDecoder(csvSource),
                        SinkEncoderFactory.createKvEncoder(kvSink));
        // case4: "3.14159265358979323846|4|4|8"
        List<String> output4 = processor.transform("3.14159265358979323846|4|4|8");
        Assert.assertEquals(1, output4.size());
        Assert.assertEquals("result=0", output4.get(0));
        // case5: "3.14159265358979323846|4|3.2|4"
        List<String> output5 = processor.transform("3.14159265358979323846|4|3.2|4");
        Assert.assertEquals(1, output5.size());
        Assert.assertEquals("result=0", output5.get(0));
        // case6: "3.14159265358979323846|4|3.2|8"
        List<String> output6 = processor.transform("3.14159265358979323846|4|3.2|8");
        Assert.assertEquals(1, output6.size());
        Assert.assertEquals("result=1", output6.get(0));
    }

Once these steps are done, the new logical operator class is complete and you can submit your code to the community. The complete code of AndOperator can be seen at code link.