Running COPY in Single Row Error Isolation Mode
By default, COPY
stops an operation at the first error: if the data contains an error, the operation fails and no data loads. If you run COPY FROM
in single row error isolation mode, HAWQ skips rows that contain format errors and loads properly formatted rows. Single row error isolation mode applies only to rows in the input file that contain format errors. If the data contains a contraint error such as violation of a NOT NULL
or CHECK
constraint, the operation fails and no data loads.
Specifying SEGMENT REJECT LIMIT
runs the COPY
operation in single row error isolation mode. Specify the acceptable number of error rows on each segment, after which the entire COPY FROM
operation fails and no rows load. The error row count is for each HAWQ segment, not for the entire load operation.
If the COPY
operation does not reach the error limit, HAWQ loads all correctly-formatted rows and discards the error rows. The LOG ERRORS INTO
clause allows you to keep error rows for further examination. Use LOG ERRORS
to capture data formatting errors internally in HAWQ. For example:
=> COPY country FROM '/data/gpdb/country_data'
WITH DELIMITER '|' LOG ERRORS INTO errtable
SEGMENT REJECT LIMIT 10 ROWS;