Configuration Steps for Apache Airflow Docker Compose

by | Sep 20, 2022

Let us take a closer look at the Apache Airflow Docker Compose configuration and set it up with the support of our Server Management support services at Bobcares.

Step-by-Step Configuration of Apache Airflow Docker Compose

Prerequisites

As we will be using docker-compose to get Airflow up and running, we must first install Docker. Simply go to the official Docker website and get the proper installation file for your operating system.

Step 1: Create a new folder

We begin the procedure by creating a new folder called airflow.

Navigate to a directory using any terminal, create the new folder, and change into it by running:

mkdir airflow
cd airflow

Step 2: Create a docker-compose file

The next step is to obtain a docker-compose file that specifies the required services or Docker containers.

We can run the following command from the terminal within the newly created airflow folder:

curl https://raw.githubusercontent.com/marvinlanhenke/Airflow/main/01GettingStarted/docker-compose.yml -o docker-compose.yml

Alternatively, simply create a new file called docker-compose.yml and paste the following text into it.

---
version: '3.4'

x-common:
  &common
  image: apache/airflow:2.3.0
  user: "${AIRFLOW_UID}:0"
  env_file:
    - .env
  volumes:
    - ./dags:/opt/airflow/dags
    - ./logs:/opt/airflow/logs
    - ./plugins:/opt/airflow/plugins
    - /var/run/docker.sock:/var/run/docker.sock

x-depends-on:
  &depends-on
  depends_on:
    postgres:
      condition: service_healthy
    airflow-init:
      condition: service_completed_successfully

services:
  postgres:
    image: postgres:13
    container_name: postgres
    ports:
      - "5434:5432"
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "airflow"]
      interval: 5s
      retries: 5
    env_file:
      - .env

  scheduler:
    <<: [*common, *depends-on]
    container_name: airflow-scheduler
    command: scheduler
    restart: on-failure
    ports:
      - "8793:8793"

  webserver:
    <<: [*common, *depends-on]
    container_name: airflow-webserver
    restart: always
    command: webserver
    ports:
      - "8080:8080"
    healthcheck:
      test: ["CMD", "curl", "--fail", "http://localhost:8080/health"]
      interval: 30s
      timeout: 30s
      retries: 5

  airflow-init:
    <<: *common
    container_name: airflow-init
    entrypoint: /bin/bash
    command:
      - -c
      - |
        mkdir -p /sources/logs /sources/dags /sources/plugins
        chown -R "${AIRFLOW_UID}:0" /sources/{logs,dags,plugins}
        exec /entrypoint airflow version

The docker-compose file above just defines the services required to get Airflow up and running. The scheduler, webserver, metadatabase (PostgreSQL), and the airflow-init job that initializes the database are the most significant.

At the top of the file, settings that are common to every Docker container or service are defined once as extension fields (x-common and x-depends-on) and then merged into each service.
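The sharing mechanism is standard YAML: an anchor (&name) stores a mapping, and the merge key (<<:) copies its keys into another mapping. A minimal, hypothetical illustration of the pattern (the names x-base and app are invented for this example):

```yaml
x-base:
  &base
  image: apache/airflow:2.3.0
  env_file:
    - .env

services:
  app:
    # Merge every key from the &base anchor into this service,
    # then add or override keys as needed.
    <<: *base
    container_name: my-app
```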

Step 3: Environment variables

We successfully constructed a docker-compose file containing the required services. However, in order to complete the installation and properly set up Airflow, we must provide some environment variables.

Next, create a .env file in the airflow folder with the following contents:

# Meta-Database
POSTGRES_USER=airflow
POSTGRES_PASSWORD=airflow
POSTGRES_DB=airflow

# Airflow Core
AIRFLOW__CORE__FERNET_KEY=UKMzEm3yIuFYEq1y3-2FxPNWSVwRASpahmQ9kQfEr8E=
AIRFLOW__CORE__EXECUTOR=LocalExecutor
AIRFLOW__CORE__DAGS_ARE_PAUSED_AT_CREATION=True
AIRFLOW__CORE__LOAD_EXAMPLES=False
AIRFLOW_UID=0

# Backend DB
AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=postgresql+psycopg2://airflow:airflow@postgres/airflow
AIRFLOW__DATABASE__LOAD_DEFAULT_CONNECTIONS=False

# Airflow Init
_AIRFLOW_DB_UPGRADE=True
_AIRFLOW_WWW_USER_CREATE=True
_AIRFLOW_WWW_USER_USERNAME=airflow
_AIRFLOW_WWW_USER_PASSWORD=airflow

The variables listed above define the database credentials, the airflow user, and various other settings.
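A quick way to catch typos before starting the stack is to confirm that the env file actually defines the variables the compose services reference. This is a hypothetical helper, not part of the original guide; the variable list is taken from the .env file above:

```shell
#!/bin/sh
# Hypothetical sanity check: confirm that an env file defines the
# variables the compose stack expects before starting it.
check_env_file() {
  for var in POSTGRES_USER POSTGRES_PASSWORD POSTGRES_DB \
             AIRFLOW__CORE__EXECUTOR AIRFLOW_UID; do
    grep -q "^${var}=" "$1" || { echo "missing: ${var}"; return 1; }
  done
  echo "all required variables present"
}

# Demo against a minimal sample file; run it on ./.env in a real setup.
printf '%s\n' 'POSTGRES_USER=airflow' 'POSTGRES_PASSWORD=airflow' \
  'POSTGRES_DB=airflow' 'AIRFLOW__CORE__EXECUTOR=LocalExecutor' \
  'AIRFLOW_UID=0' > sample.env
check_env_file sample.env
```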

Most importantly, they define the type of executor Airflow will use. In this case, we use the LocalExecutor.
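The Fernet key shown in the .env file above is a published sample and should not be reused for a real deployment. A Fernet key is simply 32 random bytes encoded as url-safe base64, so a fresh one can be generated with nothing but the Python standard library (a sketch; the cryptography package's Fernet.generate_key() produces the same format):

```python
import base64
import os

def generate_fernet_key() -> str:
    # A Fernet key is 32 random bytes, url-safe base64-encoded (44 chars).
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")

# Print a line ready to paste into the .env file.
print(f"AIRFLOW__CORE__FERNET_KEY={generate_fernet_key()}")
```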

Step 4: Run docker-compose

Then, go to the console and run the following command to start all of the required containers:

docker compose up -d

After a short time, we can examine the results in the Airflow Web UI by visiting http://localhost:8080. We gain access to the user interface after signing in with the credentials (username: airflow, password: airflow).
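Instead of refreshing the browser, readiness can be checked by polling the same /health endpoint the webserver's healthcheck uses. A stdlib-only sketch, assuming the default 8080 port mapping and the JSON shape returned by Airflow 2.x:

```python
import json
import urllib.error
import urllib.request

def scheduler_healthy(payload: str) -> bool:
    # Airflow's /health endpoint returns JSON like:
    # {"metadatabase": {"status": "healthy"},
    #  "scheduler": {"status": "healthy", ...}}
    health = json.loads(payload)
    return health.get("scheduler", {}).get("status") == "healthy"

def airflow_is_up(url: str = "http://localhost:8080/health") -> bool:
    # Returns False on connection errors so it can be safely polled in a loop.
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return scheduler_healthy(resp.read().decode())
    except (urllib.error.URLError, OSError):
        return False
```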

This is the final step in the process; Airflow is now installed and running via docker-compose.


Conclusion

To conclude, we have now learned how to configure Apache Airflow with Docker Compose. The configuration process consists of four main steps put forward by our Server Management support services for easy setup.
