Accessing HDFS Data with gphdfs (Deprecated)

Greenplum Database uses the gphdfs protocol to read and write data files in a Hadoop Distributed File System (HDFS) efficiently, leveraging the parallel architecture of HDFS.

Note: The gphdfs external table protocol is deprecated and will be removed in the next major release of Greenplum Database. Consider using the Greenplum Platform Extension Framework (PXF) pxf external table protocol to access data stored in a Hadoop file system.
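For comparison, a minimal sketch of an external table that uses the pxf protocol might look like the following. It assumes that PXF is installed and configured with a default Hadoop server, and that the `hdfs:text` profile fits the data; the table name, columns, and HDFS path are hypothetical.

```sql
-- Hypothetical table, columns, and HDFS path; adjust them to your data.
-- Assumes the PXF extension is registered in this database and a PXF
-- server configuration for the Hadoop cluster already exists.
CREATE EXTERNAL TABLE ext_expenses_pxf (
    name     text,
    date     date,
    amount   float4,
    category text
)
LOCATION ('pxf://data/expenses/filename.txt?PROFILE=hdfs:text')
FORMAT 'TEXT' (DELIMITER ',');
```

Unlike a gphdfs location, a pxf location does not name the HDFS host and port in the URI; that connection information comes from the PXF server configuration.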

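Using the gphdfs protocol with HDFS involves three steps: performing a one-time installation of the gphdfs protocol, granting privileges for the gphdfs protocol, and specifying the gphdfs protocol in an external table definition.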
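As a rough illustration of gphdfs usage once the protocol has been installed and configured, the following sketch grants protocol privileges to a role and then defines a readable external table. The role name, host, port, file path, and column definitions are hypothetical placeholders, not values from this documentation.

```sql
-- Allow a role to create readable (SELECT) and writable (INSERT)
-- external tables that use the gphdfs protocol.
-- "etl_user" is a hypothetical role.
GRANT SELECT ON PROTOCOL gphdfs TO etl_user;
GRANT INSERT ON PROTOCOL gphdfs TO etl_user;

-- Define a readable external table over a comma-delimited text file in HDFS.
-- The host "hdfshost-1", port 8081, and path are placeholders.
CREATE EXTERNAL TABLE ext_expenses (
    name     text,
    date     date,
    amount   float4,
    category text
)
LOCATION ('gphdfs://hdfshost-1:8081/data/filename.txt')
FORMAT 'TEXT' (DELIMITER ',');
```

Querying `ext_expenses` then reads the file through the Greenplum Database segments in parallel. Only a superuser can grant privileges on a protocol.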

For information about using Greenplum Database external tables with Amazon EMR when Greenplum Database is installed on Amazon Web Services (AWS), see also Using Amazon EMR with Greenplum Database installed on AWS (Deprecated).
