To allow users to submit data requests to a cluster concurrently, Dataflow supports creating multiple Livy Sessions. Data Analysts, ETL Developers, and ETL Testers can create multiple Livy Sessions to ingest, query, transform, analyze, and synchronize data. Spark processes each Livy request, which carries a task that runs across data sets, and returns results in a matter of seconds.

Add or Edit a Livy Server

To add a Livy Server,

  1. Navigate to Admin Settings > Livy > Livy Configuration.
  2. Click Livy +.
  3. Under Server Configuration, complete the following fields:
    • Name. Enter a name for the Livy Server.
    • Livy Server URL. Specify the URL of the Livy Server.
    • Cluster Type. Available cluster types are Livy, AWS EMR, HDInsight, HortonWorks, and Kubernetes.
    • Authentication Type. Select any of the following authentication types.
      • No Authentication
      • Basic. Enter username and password.
      • Kerberos-TicketCache. Select whether you want to impersonate users.
      • Kerberos-KeytabFile. A keytab file contains pairs of Kerberos principals and encrypted keys, and lets you authenticate users with Kerberos without entering a password. If you change your Kerberos password, you must recreate all your keytabs. Choose between Impersonate Users and User specific keytab files.
      • OAuth. An authorization framework that enables DataOps Suite to obtain limited access to user accounts.
  4. Under Livy Session Properties, complete the following fields:
    • Kind. The kind of session to create on the Livy Server: Interactive Scala Spark, Python Spark, or R Spark.
    • proxyUser. The user to impersonate when running the job or request. 
    • pyFiles. Python files that will be used in the session. 
    • files. Files that will be used in the session.
    • driverMemory. The amount of memory to use for the driver process. 
    • driverCores. The number of cores to use for the driver process.
    • executorMemory. The amount of memory to use per executor process.
    • executorCores. The number of cores to use for each executor.
    • numExecutors. The number of executors to launch for this session.
    • archives. Archives that will be used in the session.
    • queue. The name of the YARN queue to which the job or request is submitted.
    • name. The name of the session.
    • conf. Spark configuration properties. Enter the properties as a JSON object of key-value pairs, in the following format: {"key1":"val1","key2":"val2"}.
    • heartbeatTimeoutinSecond. The timeout in seconds after which the session will be orphaned.
  5. Click Test to verify the Livy configuration.
  6. Click Save.
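
The Livy Session Properties listed in step 4 use the same names as the request body of the Apache Livy REST API call POST /sessions. The following is a minimal sketch of how such a body could be assembled, assuming the fields are passed to Livy unchanged; how Dataflow actually submits them is not documented here. The helper build_session_payload, the session name, and all values are hypothetical. The kind values shown are the identifiers the Livy REST API uses for Scala, Python, R, and SQL sessions.

```python
import json

# Session kinds as named in the Apache Livy REST API.
KINDS = {"spark", "pyspark", "sparkr", "sql"}


def build_session_payload(kind, conf=None, **properties):
    """Build a JSON-serializable body for Livy's POST /sessions (illustrative helper)."""
    if kind not in KINDS:
        raise ValueError(f"unsupported session kind: {kind!r}")
    payload = {"kind": kind}
    # Leave out unset properties so that Livy applies its own defaults.
    payload.update({k: v for k, v in properties.items() if v is not None})
    if conf:
        # conf is a flat object mapping Spark property names to values.
        payload["conf"] = {str(k): str(v) for k, v in conf.items()}
    return payload


payload = build_session_payload(
    "pyspark",
    name="example-session",        # hypothetical session name
    proxyUser=None,                # unset, so it is omitted from the body
    driverMemory="2g",
    executorMemory="4g",
    executorCores=2,
    numExecutors=4,
    queue="default",
    heartbeatTimeoutInSecond=600,  # spelling used by the Livy REST API
    conf={"spark.sql.shuffle.partitions": "200"},
)
print(json.dumps(payload, indent=2))
```

The conf argument mirrors the {"key1":"val1","key2":"val2"} format described for the conf field: a single flat JSON object, with Spark property names as keys.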

Delete a Livy Session

To delete a Livy session,

  1. Navigate to Admin Settings > Livy > Livy Sessions.
  2. Select the Livy session and click the Delete icon.
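
For reference, the Apache Livy REST API removes a session with DELETE /sessions/{sessionId}. The sketch below only constructs that request and does not send it. Whether the Delete icon issues this exact call is an assumption, and the helper build_delete_request, the server URL, and the session ID are hypothetical.

```python
from urllib.request import Request


def build_delete_request(livy_url, session_id):
    """Construct (but do not send) a DELETE request for one Livy session."""
    url = f"{livy_url.rstrip('/')}/sessions/{session_id}"
    return Request(url, method="DELETE")


# Hypothetical Livy Server URL and session ID.
req = build_delete_request("http://livy.example.com:8998/", 3)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; any authentication configured for the Livy Server would have to be added first.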

© Datagaps. All rights reserved.