Creating External Tables - Examples

The following examples show how to define external data with different protocols. Each CREATE EXTERNAL TABLE command can contain only one protocol.

Note: When using IPv6, always enclose the numeric IP addresses in square brackets.

Start gpfdist before you create external tables with the gpfdist protocol. The following code starts the gpfdist file server program in the background on port 8081 serving files from directory /var/data/staging. The logs are saved in /home/gpadmin/log.

$ gpfdist -p 8081 -d /var/data/staging -l /home/gpadmin/log &

Example 1 - Single gpfdist instance on single-NIC machine

Creates a readable external table, ext_expenses, using the gpfdist protocol. The files are formatted with a pipe (|) as the column delimiter.

=# CREATE EXTERNAL TABLE ext_expenses
        ( name text, date date, amount float4, category text, desc1 text )
    LOCATION ('gpfdist://etlhost-1:8081/*')
    FORMAT 'TEXT' (DELIMITER '|');

Example 2 - Multiple gpfdist instances

Creates a readable external table, ext_expenses, using the gpfdist protocol from all files with the txt extension. The column delimiter is a pipe ( | ) and NULL is a space (’ ’).

=# CREATE EXTERNAL TABLE ext_expenses
        ( name text, date date, amount float4, category text, desc1 text )
    LOCATION ('gpfdist://etlhost-1:8081/*.txt', 'gpfdist://etlhost-2:8081/*.txt')
    FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;

Example 3 - Multiple gpfdists instances

Creates a readable external table, ext_expenses, from all files with the txt extension using the gpfdists protocol. The column delimiter is a pipe ( | ) and NULL is a space (’ ’). For information about the location of security certificates, see gpfdists Protocol.

  1. Run gpfdist with the --ssl option.
  2. Run the following command.

    =# CREATE EXTERNAL TABLE ext_expenses
             ( name text, date date, amount float4, category text, desc1 text )
        LOCATION ('gpfdists://etlhost-1:8081/*.txt', 'gpfdists://etlhost-2:8082/*.txt')
        FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;
    
    

Example 4 - Single gpfdist instance with error logging

Uses the gpfdist protocol to create a readable external table, ext_expenses, from all files with the txt extension. The column delimiter is a pipe ( | ) and NULL (’ ’) is a space.

Access to the external table is single row error isolation mode. Input data formatting errors can be captured so that you can view the errors, fix the issues, and then reload the rejected data. If the error count on a segment is greater than five (the SEGMENT REJECT LIMIT value), the entire external table operation fails and no rows are processed.

=# CREATE EXTERNAL TABLE ext_expenses
         ( name text, date date, amount float4, category text, desc1 text )
    LOCATION ('gpfdist://etlhost-1:8081/*.txt', 'gpfdist://etlhost-2:8082/*.txt')
    FORMAT 'TEXT' ( DELIMITER '|' NULL ' ')
    LOG ERRORS INTO expenses_errs SEGMENT REJECT LIMIT 5;

To create the readable ext_expenses table from CSV-formatted text files:

=# CREATE EXTERNAL TABLE ext_expenses
         ( name text, date date, amount float4, category text, desc1 text )
    LOCATION ('gpfdist://etlhost-1:8081/*.txt', 'gpfdist://etlhost-2:8082/*.txt')
    FORMAT 'CSV' ( DELIMITER ',' )
    LOG ERRORS INTO expenses_errs SEGMENT REJECT LIMIT 5;

Example 5 - Readable Web External Table with Script

Creates a readable web external table that executes a script once on five virtual segments:

=# CREATE EXTERNAL WEB TABLE log_output (linenum int, message text)
    EXECUTE '/var/load_scripts/get_log_data.sh' ON 5
    FORMAT 'TEXT' (DELIMITER '|');

Example 6 - Writable External Table with gpfdist

Creates a writable external table, sales_out, that uses gpfdist to write output data to the file sales.out. The column delimiter is a pipe ( | ) and NULL is a space (’ ’). The file will be created in the directory specified when you started the gpfdist file server.

=# CREATE WRITABLE EXTERNAL TABLE sales_out (LIKE sales)
    LOCATION ('gpfdist://etl1:8081/sales.out')
    FORMAT 'TEXT' ( DELIMITER '|' NULL ' ')
    DISTRIBUTED BY (txn_id);

Example 7 - Writable External Web Table with Script

Creates a writable external web table, campaign_out, that pipes output data recieved by the segments to an executable script, to_adreport_etl.sh:

=# CREATE WRITABLE EXTERNAL WEB TABLE campaign_out
        (LIKE campaign)
        EXECUTE '/var/unload_scripts/to_adreport_etl.sh' ON 6
        FORMAT 'TEXT' (DELIMITER '|');

Example 8 - Readable and Writable External Tables with XML Transformations

HAWQ can read and write XML data to and from external tables with gpfdist. For information about setting up an XML transform, see Transforming XML Data.