
Commit 433a6eb

Lisa Owen authored and David Yozie committed
docs - pxf init/sync support to master standby (#7540)
* docs - pxf init/sync support to master standby
* edits requested by david
* edits requested by francisco and oliver
* pxf sync from master TO standby or seg host
* identify sync run on master in pxf sync option description
1 parent 689959e commit 433a6eb

16 files changed

Lines changed: 57 additions & 50 deletions

gpdb-doc/markdown/pxf/access_hdfs.html.md.erb

Lines changed: 2 additions & 2 deletions
@@ -46,8 +46,8 @@ The PXF agent invokes the HDFS Java API to read the data and delivers it to the
 
 Before working with Hadoop data using PXF, ensure that:
 
-- You have configured and initialized PXF on your Greenplum Database segment hosts, and PXF is running on each host. See [Configuring PXF](instcfg_pxf.html) for additional information.
-- You have configured the PXF Hadoop Connectors that you plan to use on each Greenplum Database segment host. Refer to [Configuring PXF Hadoop Connectors](client_instcfg.html) for instructions. If you plan to access JSON-formatted data stored in a Cloudera Hadoop cluster, PXF requires a Cloudera version 5.8 or later Hadoop distribution.
+- You have configured and initialized PXF, and PXF is running on each Greenplum Database segment host. See [Configuring PXF](instcfg_pxf.html) for additional information.
+- You have configured the PXF Hadoop Connectors that you plan to use. Refer to [Configuring PXF Hadoop Connectors](client_instcfg.html) for instructions. If you plan to access JSON-formatted data stored in a Cloudera Hadoop cluster, PXF requires a Cloudera version 5.8 or later Hadoop distribution.
 - If user impersonation is enabled (the default), ensure that you have granted read (and write as appropriate) permission to the HDFS files and directories that will be accessed as external tables in Greenplum Database to each Greenplum Database user/role name that will access the HDFS files and directories. If user impersonation is not enabled, you must grant this permission to the `gpadmin` user.
 - Time is synchronized between the Greenplum Database segment hosts and the external Hadoop systems.

gpdb-doc/markdown/pxf/access_objstore.html.md.erb

Lines changed: 2 additions & 2 deletions
@@ -27,8 +27,8 @@ PXF is installed with connectors to Azure Blob Storage, Azure Data Lake, Google
 
 Before working with object store data using PXF, ensure that:
 
-- You have configured and initialized PXF on your Greenplum Database segment hosts, and PXF is running on each host. See [Configuring PXF](instcfg_pxf.html) for additional information.
-- You have configured the PXF Object Store Connectors that you plan to use on each Greenplum Database segment host. Refer to [Configuring Connectors to Azure, Google Cloud Storage, Minio, and S3 Object Stores](objstore_cfg.html) for instructions.
+- You have configured and initialized PXF, and PXF is running on each Greenplum Database segment host. See [Configuring PXF](instcfg_pxf.html) for additional information.
+- You have configured the PXF Object Store Connectors that you plan to use. Refer to [Configuring Connectors to Azure, Google Cloud Storage, Minio, and S3 Object Stores](objstore_cfg.html) for instructions.
 - Time is synchronized between the Greenplum Database segment hosts and the external object store systems.
 

gpdb-doc/markdown/pxf/cfginitstart_pxf.html.md.erb

Lines changed: 2 additions & 2 deletions
@@ -27,7 +27,7 @@ PXF provides two management commands:
 - `pxf cluster` - manage all PXF service instances in the Greenplum Database cluster
 - `pxf` - manage the PXF service instance on a specific Greenplum Database host
 
-The [`pxf cluster`](ref/pxf-cluster.html) command supports `init`, `start`, `stop`, and `sync` subcommands. When you run a `pxf cluster` subcommand on the Greenplum Database master host, you perform the operation on all segment hosts in the Greenplum Database cluster.
+The [`pxf cluster`](ref/pxf-cluster.html) command supports `init`, `start`, `stop`, and `sync` subcommands. When you run a `pxf cluster` subcommand on the Greenplum Database master host, you perform the operation on all segment hosts in the Greenplum Database cluster. PXF also runs the `init` and `sync` commands on the standby master host.
 
 The [`pxf`](ref/pxf.html) command supports `init`, `start`, `stop`, `restart`, and `status` operations. These operations run locally. That is, if you want to start or stop the PXF agent on a specific Greenplum Database segment host, you log in to the host and run the command.
 
@@ -54,7 +54,7 @@ The `pxf-env.sh` file exposes the following PXF runtime configuration parameters
 | PXF_KEYTAB | The absolute path to the PXF service Kerberos principal keytab file. | $PXF_CONF/keytabs/pxf.service.keytab |
 | PXF_PRINCIPAL | The PXF service Kerberos principal. | gpadmin/\_HOST@EXAMPLE.COM |
 
-You must synchronize any changes that you make to `pxf-env.sh`, `pxf-log4j.properties`, or `pxf-profiles.xml` to each Greenplum Database segment host, and (re)start PXF on each host.
+You must synchronize any changes that you make to `pxf-env.sh`, `pxf-log4j.properties`, or `pxf-profiles.xml` to the Greenplum Database cluster, and (re)start PXF on each segment host.
 
 ### <a id="start_pxf_prereq" class="no-quick-link"></a>Prerequisites

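The sync-then-restart sequence that the revised `pxf-env.sh` paragraph in `cfginitstart_pxf.html.md.erb` describes could be scripted roughly as follows. This is a minimal sketch and not part of the commit. It uses only the `sync`, `stop`, and `start` subcommands of `pxf cluster` that the documentation lists. The `run_step` helper, the `DRY_RUN` switch, and the fallback `GPHOME` path are hypothetical names and values chosen for illustration. With the default `DRY_RUN=1` the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: push edited PXF configuration from the master host to the cluster,
# then restart PXF on the segment hosts by stopping and starting the cluster.
set -euo pipefail

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

sync_and_restart() {
    # Assumption: fall back to an illustrative install path when GPHOME is unset.
    local pxf_bin="${GPHOME:-/usr/local/greenplum-db}/pxf/bin/pxf"
    run_step "$pxf_bin" cluster sync    # copy configuration to the standby master and segment hosts
    run_step "$pxf_bin" cluster stop    # stop PXF on the segment hosts
    run_step "$pxf_bin" cluster start   # start PXF again so the new settings are loaded
}

sync_and_restart
```

Run it on the master host as `gpadmin`, and set `DRY_RUN=0` only after checking the printed commands.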
gpdb-doc/markdown/pxf/client_instcfg.html.md.erb

Lines changed: 3 additions & 3 deletions
@@ -10,13 +10,13 @@ PXF is compatible with Cloudera, Hortonworks Data Platform, MapR, and generic Ap
 
 Configuring PXF Hadoop connectors involves copying configuration files from your Hadoop cluster to the Greenplum Database master host. If you are using the MapR Hadoop distribution, you must also copy certain JAR files to the master host. Before you configure the PXF Hadoop connectors, ensure that you can copy files from hosts in your Hadoop cluster to the Greenplum Database master.
 
-In this procedure, you copy Hadoop configuration files to the `$PXF_CONF/servers/default` directory on the Greenplum Database master host. You may also copy libraries to `$PXF_CONF/lib` for MapR support. You then synchronize the PXF configuration on the master host to the segment hosts. (PXF creates the`$PXF_CONF/*` directories when you run `pxf cluster init`.)
+In this procedure, you copy Hadoop configuration files to the `$PXF_CONF/servers/default` directory on the Greenplum Database master host. You may also copy libraries to `$PXF_CONF/lib` for MapR support. You then synchronize the PXF configuration on the master host to the standby master and segment hosts. (PXF creates the`$PXF_CONF/*` directories when you run `pxf cluster init`.)
 
 **Note**: After you complete the configuration procedure, you will have configured the PXF default Hadoop server. End users need not provide a `SERVER` option in a `CREATE EXTERNAL TABLE` command when they access the default Hadoop server configuration.
 
 ## <a id="client-pxf-config-steps"></a>Procedure
 
-Perform the following procedure to configure the desired PXF Hadoop-related connectors on the Greenplum Database master host. After you configure the connectors, you will use the `pxf cluster sync` command to copy the PXF configuration to the segment hosts in your Greenplum Database cluster.
+Perform the following procedure to configure the desired PXF Hadoop-related connectors on the Greenplum Database master host. After you configure the connectors, you will use the `pxf cluster sync` command to copy the PXF configuration to the Greenplum Database cluster.
 
 1. Log in to your Greenplum Database master node:
 
@@ -55,7 +55,7 @@ Perform the following procedure to configure the desired PXF Hadoop-related conn
 gpadmin@gpmaster$ scp mapruser@maprhost:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/hadoop-common-2.7.0-mapr-1707.jar .
 ```
 
-5. Synchronize the PXF configuration to each Greenplum Database segment host. For example:
+5. Synchronize the PXF configuration to the Greenplum Database cluster. For example:
 
 ``` shell
 gpadmin@gpmaster$ $GPHOME/pxf/bin/pxf cluster sync

gpdb-doc/markdown/pxf/hdfs_seqfile.html.md.erb

Lines changed: 1 addition & 1 deletion
@@ -220,7 +220,7 @@ public class PxfExample_CustomWritable implements Writable {
 gpadmin@gpmaster$ cp /home/gpadmin/pxfex-customwritable.jar /usr/local/greenplum-pxf/lib/pxfex-customwritable.jar
 ```
 
-5. Synchronize the PXF configuration to each Greenplum Database segment host. For example:
+5. Synchronize the PXF configuration to the Greenplum Database cluster. For example:
 
 ``` shell
 gpadmin@gpmaster$ $GPHOME/pxf/bin/pxf cluster sync

gpdb-doc/markdown/pxf/init_pxf.html.md.erb

Lines changed: 2 additions & 2 deletions
@@ -56,13 +56,13 @@ Perform the following procedure to initialize PXF on each segment host in your G
 $ ssh gpadmin@<gpmaster>
 ```
 
-4. Run the `pxf cluster init` command to initialize the PXF service on the master and on each segment host. For example, the following command specifies `/usr/local/greenplum-pxf` as the PXF user configuration directory for initialization:
+4. Run the `pxf cluster init` command to initialize the PXF service on the master, standby master, and on each segment host. For example, the following command specifies `/usr/local/greenplum-pxf` as the PXF user configuration directory for initialization:
 
 ``` shell
 gpadmin@gpmaster$ PXF_CONF=/usr/local/greenplum-pxf $GPHOME/pxf/bin/pxf cluster init
 ```
 
 The `init` command creates the PXF web application and initializes the internal PXF configuration. The `init` command also creates the `$PXF_CONF` user configuration directory if it does not exist, and populates the directory with user-customizable configuration templates.
 
-**Note**: The PXF service runs only on the segment hosts. However,`pxf cluster init` also sets up the PXF user configuration directories on the Greenplum Database master host.
+**Note**: The PXF service runs only on the segment hosts. However,`pxf cluster init` also sets up the PXF user configuration directories on the Greenplum Database master and standby master hosts.

gpdb-doc/markdown/pxf/install_java.html.md.erb

Lines changed: 7 additions & 6 deletions
@@ -13,35 +13,36 @@ Ensure that you have access to, or superuser permissions to install, Java versio
 
 ## <a id="proc"></a>Procedure
 
-Perform the following procedure to install Java on the master and on each segment host in your Greenplum Database cluster. You will use the `gpssh` utility where possible to run a command on multiple hosts.
+Perform the following procedure to install Java on the master, standby master, and on each segment host in your Greenplum Database cluster. You will use the `gpssh` utility where possible to run a command on multiple hosts.
 
 1. Log in to your Greenplum Database master node:
 
 ``` shell
 $ ssh gpadmin@<gpmaster>
 ```
 
-2. Create a text file that lists your Greenplum Database segment hosts, one host name per line. For example, a file named `seghostfile` may include:
+2. Create a text file that lists your Greenplum Database standby master host and segment hosts, one host name per line. For example, a file named `gphostfile` may include:
 
 ``` pre
+mstandby
 seghost1
 seghost2
 seghost3
 ```
 
-3. Install Java on the master and on each Greenplum Database segment host, and then set up the Java environment on each host.
+3. Install Java on the master, standby master, and on each Greenplum Database segment host, and then set up the Java environment on each host.
 
 1. Install the Java package. For example, to install Java version 1.8:
 
 ``` shell
 gpadmin@gpmaster$ sudo yum -y install java-1.8.0-openjdk-1.8.0*
-gpadmin@gpmaster$ gpssh -e -v -f seghostfile sudo yum -y install java-1.8.0-openjdk-1.8.0*
+gpadmin@gpmaster$ gpssh -e -v -f gphostfile sudo yum -y install java-1.8.0-openjdk-1.8.0*
 ```
 
-2. Identify the Java base install directory. Update the `gpadmin` user's `.bashrc` file on each segment host to include this `$JAVA_HOME` setting if it is not already present. For example:
+2. Identify the Java base install directory. Update the `gpadmin` user's `.bashrc` file on each host to include this `$JAVA_HOME` setting if it is not already present. For example:
 
 ``` shell
 gpadmin@gpmaster$ echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.x86_64/jre' >> /home/gpadmin/.bashrc
-gpadmin@gpmaster$ gpssh -e -v -f seghostfile "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.x86_64/jre' >> /home/gpadmin/.bashrc"
+gpadmin@gpmaster$ gpssh -e -v -f gphostfile "echo 'export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.x86_64/jre' >> /home/gpadmin/.bashrc"
 ```

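Step 2 of the `install_java.html.md.erb` change asks for a host file that now includes the standby master. One way to generate it is sketched below. The sketch is not part of the commit. The `write_hostfile` function and the `HOSTFILE` variable are hypothetical names. The host names are the placeholder values from the documentation example and would be replaced with real host names.

```shell
#!/usr/bin/env bash
# Sketch: write a gpssh host file with one host name per line.
set -euo pipefail

# Hypothetical helper: first argument is the output file, the rest are host names.
write_hostfile() {
    if [ "$#" -lt 2 ]; then
        echo "usage: write_hostfile <file> <host> [<host> ...]" >&2
        return 1
    fi
    local outfile="$1"
    shift
    printf '%s\n' "$@" > "$outfile"
}

# List the standby master first, followed by the segment hosts.
HOSTFILE="${HOSTFILE:-gphostfile}"
write_hostfile "$HOSTFILE" mstandby seghost1 seghost2 seghost3
cat "$HOSTFILE"
```

The master host is deliberately left out of the file, because the procedure runs each command locally on the master and then through `gpssh -f gphostfile` for the remaining hosts.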
gpdb-doc/markdown/pxf/jdbc_cfg.html.md.erb

Lines changed: 3 additions & 3 deletions
@@ -42,7 +42,7 @@ When you configure the PXF JDBC Connector to access an external SQL database, yo
 2. Create the directory `$PXF_CONF/servers/<server_name>`.
 3. Copy the PXF `jdbc-site.xml` template configuration file to the new server directory.
 4. Fill in appropriate values for the properties in the template file.
-6. Synchronize the server configuration to each Greenplum Database segment host.
+6. Synchronize the server configuration to the Greenplum Database cluster.
 7. Publish the PXF server name(s) to your Greenplum Database end users as appropriate.
 
 The Greenplum Database user specifies the `<server_name>` in the `CREATE EXTERNAL TABLE` `LOCATION` clause `SERVER` option to access the external SQL database. For example, if you created a server configuration and named the server directory `pgsrvcfg`:
@@ -61,7 +61,7 @@ While not recommended, you can override a JDBC server configuration by directly
 
 Ensure that you have initialized PXF before you configure a JDBC Connector server.
 
-In this procedure, you name and add a PXF JDBC server configuration for a PostgreSQL database and synchronize the server configuration(s) to all segment hosts.
+In this procedure, you name and add a PXF JDBC server configuration for a PostgreSQL database and synchronize the server configuration(s) to the Greenplum Database cluster.
 
 1. Log in to your Greenplum Database master node:
 
@@ -110,7 +110,7 @@ In this procedure, you name and add a PXF JDBC server configuration for a Postgr
 ```
 6. Save your changes and exit the editor.
 
-7. Use the `pxf cluster sync` command to copy the new server configurations to each Greenplum Database segment host. For example:
+7. Use the `pxf cluster sync` command to copy the new server configuration to the Greenplum Database cluster. For example:
 
 ``` shell
 gpadmin@gpmaster$ $GPHOME/pxf/bin/pxf cluster sync

gpdb-doc/markdown/pxf/jdbc_pxf.html.md.erb

Lines changed: 3 additions & 3 deletions
@@ -31,7 +31,7 @@ This section describes how to use the PXF JDBC connector to access data in an ex
 
 Before you access an external SQL database using the PXF JDBC connector, ensure that:
 
-- You have configured and initialized PXF on your Greenplum Database segment hosts, and PXF is running on each host. See [Configuring PXF](instcfg_pxf.html) for additional information.
+- You have configured and initialized PXF, and PXF is running on each Greenplum Database segment host. See [Configuring PXF](instcfg_pxf.html) for additional information.
 - You can identify the PXF user configuration directory (`$PXF_CONF`).
 - Connectivity exists between all Greenplum Database segment hosts and the external SQL database.
 - You have configured your external SQL database for user access from all Greenplum Database segment hosts.
@@ -222,7 +222,7 @@ Perform the following steps to create a PostgreSQL table named `forpxf_table1` i
 
 #### <a id="ex_jdbconfig"></a>Configure the JDBC Connector
 
-You must create a JDBC server configuration for PostgreSQL, download the PostgreSQL driver JAR file to your system, copy the JAR file to the PXF user configuration directory, synchronize PXF configuration, and then restart PXF.
+You must create a JDBC server configuration for PostgreSQL, download the PostgreSQL driver JAR file to your system, copy the JAR file to the PXF user configuration directory, synchronize the PXF configuration, and then restart PXF.
 
 1. Log in to the Greenplum Database master node:
 
@@ -262,7 +262,7 @@ You must create a JDBC server configuration for PostgreSQL, download the Postgre
 gpadmin@gpmaster$ cp postgresql-42.2.5.jar $PXF_CONF/lib/postgresql-42.2.5.jar
 ```
 
-3. Synchronize PXF configuration to all Greenplum Database segment hosts. For example:
+3. Synchronize the PXF configuration to the Greenplum Database cluster. For example:
 
 ``` shell
 gpadmin@gpmaster$ $GPHOME/pxf/bin/pxf cluster sync

gpdb-doc/markdown/pxf/objstore_cfg.html.md.erb

Lines changed: 3 additions & 3 deletions
@@ -46,7 +46,7 @@ When you configure a PXF object store connector, you add at least one named PXF
 3. Copy the PXF template configuration file corresponding to the object store to the new server directory.
 4. Fill in appropriate values for the properties in the template file.
 5. Add additional properties and values if required for your environment.
-6. Synchronize the server configuration to each Greenplum Database segment host.
+6. Synchronize the server configuration to the Greenplum Database cluster.
 7. Publish the PXF server names to your Greenplum Database end users as appropriate.
 
 The Greenplum Database user specifies the server name in the `CREATE EXTERNAL TABLE` `LOCATION` clause `SERVER` option to access the object store. For example:
@@ -224,7 +224,7 @@ To enable SSE-C for a specific S3 bucket, use the property name variants that in
 
 Ensure that you have initialized PXF before you configure an object store connector.
 
-In this procedure, you name and add a PXF server configuration in the `$PXF_CONF/servers` directory on the Greenplum Database master host for each object store connector that you plan to use. You then use the `pxf cluster sync` command to sync the server configuration(s) to all segment hosts.
+In this procedure, you name and add a PXF server configuration in the `$PXF_CONF/servers` directory on the Greenplum Database master host for each object store connector that you plan to use. You then use the `pxf cluster sync` command to sync the server configuration(s) to the Greenplum Database cluster.
 
 1. Log in to your Greenplum Database master node:
 
@@ -275,7 +275,7 @@ In this procedure, you name and add a PXF server configuration in the `$PXF_CONF
 
 6. Repeat Step 3 to configure the next object store connector.
 
-4. Use the `pxf cluster sync` command to copy the new server configurations to each Greenplum Database segment host. For example:
+4. Use the `pxf cluster sync` command to copy the new server configurations to the Greenplum Database cluster. For example:
 
 ``` shell
 gpadmin@gpmaster$ $GPHOME/pxf/bin/pxf cluster sync
