Configuration of the data warehouse

Mappings in seed tables

The implementation repo template includes default Ed-Fi descriptors and default configuration for their analytic interpretation. If your implementation has custom descriptors, you must add them to the following dbt seed tables. Use this reference guide to find the right table.

  • xwalk_attendance_events.csv: give attendanceEventDescriptors an analytic interpretation for whether a student is absent or not

  • xwalk_calendar_events.csv: determine which calendarEventDescriptors are and are not school days

  • xwalk_course_level_characteristics.csv: give course characteristics an analytic interpretation in standard categories for grouping of courses

  • xwalk_discipline_actions.csv: assign severity order and indicator (e.g. in-school, out-of-school, expulsion, minor) to discipline actions

  • xwalk_id_types_course.csv: assign types of course codes; common are local and state

  • xwalk_id_types_ed_org.csv: assign types of education organization IDs; common are state, local, and NCES codes

  • xwalk_id_types_staff.csv: assign types of educator IDs; common are state and district / local

  • xwalk_id_types_student.csv: assign types of student IDs; common are state and district / local

  • xwalk_letter_grades.csv: give letter grades an analytic interpretation: whether to include them in GPA calculations, GPA points, D/F categorization, and sort order

  • metrics_thresholds/absentee_categories.csv: assign thresholds to an arbitrary number of absenteeism categories

  • student/xwalk_student_characteristics.csv: standardize student characteristics into binary indicators

  • assessments/xwalk_student_assessment_subject: assign subject to a student assessment record based on a score result. This is necessary when an assessment identifier does not map to a single subject (e.g. 'NWEA_MAP_V1')

  • assessments/xwalk_assessment_scores: normalize student assessment score names across assessments to a determined set of scores to be used as columns in fct_student_assessment

  • assessments/xwalk_objective_assessment_scores: normalize student objective assessment score names across assessments to a determined set of scores to be used as columns in fct_student_objective_assessment

Manage configuration options in dbt project configuration

Each dbt project contains a dbt_project.yml, which is the other place for project-level configuration. In addition to standard dbt project configuration, the vars: key controls the rules that the edu warehouse applies. The dbt_project.yml within the edu warehouse repo contains default values for the variables described below. To override or add to these defaults, update the dbt_project.yml in the implementation repo.

Configurable values are either "flattened" with an edu prefix, signifying that they are called explicitly by the edu warehouse code, or "nested", signifying that their structure is dependent on the implementation.

Configuring "flattened" variables:

"Flattened" configurable variables are formatted as:

vars:
  'edu:CONFIG_DOMAIN:CONFIG_VARIABLE_NAME': CONFIGURED_VALUE

For example,

vars:
  'edu:stu_demos:multiple_races_code': Multiple

Existing "flattened" configurable variables:

| Variable Name | Description | Example/Default Value* |
| --- | --- | --- |
| edu:stu_demos:multiple_races_code | The string to display for race_ethnicity in dim_student if a student has multiple values for race in the ODS | Multiple |
| edu:stu_demos:hispanic_latino_code | The string to display for race_ethnicity in dim_student if a student has hispanic/latino ethnicity in the ODS | Latinx |
| edu:stu_demos:race_unknown_code | The string to display for race_ethnicity in dim_student if a student has no value for race in the ODS | Unknown |
| edu:stu_demos:start_date_column | The start date column from StudentSpecialEducationProgramAssociations used to compute whether a student is actively enrolled in special education program(s), as recorded in is_special_education_active in dim_student | spec_ed_program_begin_date |
| edu:stu_demos:exit_date_column | The exit date column from StudentSpecialEducationProgramAssociations used to compute whether a student is actively enrolled in special education program(s), as recorded in is_special_education_active in dim_student | spec_ed_program_end_date |
| edu:stu_demos:exclude_programs | The list of program_name values from StudentSpecialEducationProgramAssociations to exclude when assigning students to is_special_education_active and other special_education variables in dim_student | Null |
| edu:stu_demos:in_attendance_code | The string to display for student attendance_event_category in attendance tables if a student does not have an absence/attendance record for a given day they were enrolled | In Attendance |
| edu:stu_demos:chronic_absence_threshold | Threshold of attendance rates that count as chronically absent (if 90, students with <90% attendance are chronically absent) | 90 |
| edu:stu_demos:chronic_absence_min_days | Threshold of enrolled days needed to count toward the chronic absence calculation | 20 |
| edu:stu_demos:exclude_withdraw_codes | The list of exclude_withdraw_type values that should be used to exclude students from fct_student_school_association and all downstream measures | ['No show', 'Invalid enrollment'] |

* Note: the source of truth for default values is the dbt_project.yml of the edu warehouse repo, not this document.
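
As an illustration, several of the variables in the table above could be overridden together in the implementation repo's dbt_project.yml. This is a minimal sketch: the variable names come from the table, but every value shown is a hypothetical choice for an imaginary district, and the value types (numbers for thresholds, lists for the exclude_* variables) are assumed from the example/default column rather than confirmed against the edu warehouse code.

```yaml
vars:
  # Hypothetical display label; the table lists Latinx as the example/default
  'edu:stu_demos:hispanic_latino_code': 'Hispanic/Latino'

  # Hypothetical: treat students below 85% attendance as chronically absent
  'edu:stu_demos:chronic_absence_threshold': 85

  # Hypothetical: require 30 enrolled days before a student counts
  'edu:stu_demos:chronic_absence_min_days': 30

  # Hypothetical: the two listed defaults plus one invented local code
  'edu:stu_demos:exclude_withdraw_codes': ['No show', 'Invalid enrollment', 'Duplicate record']

  # Hypothetical program name to leave out of is_special_education_active
  'edu:stu_demos:exclude_programs': ['Speech Only']
```

Variables that are not listed in the implementation repo should keep the defaults defined in the edu warehouse repo.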

Configuring "nested" variables that are Ed-Fi Extensions:

To add any Ed-Fi extension variable to the warehouse, first confirm that the data is available in the JSON payload of your implementation's API response under "_ext". Then add your nested variable to the extensions: dict, using this format:

vars:
  extensions:
    EDU_MODEL_NAME:
      EXTENSION_TARGET_COLUMN:
        name: 'EXTENSION AS FOUND IN _ext:'
        dtype: 'DATABASE DATA TYPE TO CONVERT TO'

For example,

vars:
  extensions:
    stg_ef3__student_special_education_program_associations:
      iep_exit_date:
        name: 'my_district_namespace:iepExitDate'
        dtype: 'timestamp'

On your next dbt run, there will be a timestamp column called iep_exit_date in stg_ef3__student_special_education_program_associations, populated with data from _ext:my_district_namespace:iepExitDate.
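
The same pattern should extend to more than one extension column on a model. The sketch below keeps the iep_exit_date entry from the example above and adds a second column. That second entry is hypothetical: iep_review_required, the iepReviewRequired field, and the boolean dtype are invented for illustration and do not come from any real extension.

```yaml
vars:
  extensions:
    stg_ef3__student_special_education_program_associations:
      # Entry from the example above
      iep_exit_date:
        name: 'my_district_namespace:iepExitDate'
        dtype: 'timestamp'
      # Hypothetical second extension column (name, field, and dtype are assumptions)
      iep_review_required:
        name: 'my_district_namespace:iepReviewRequired'
        dtype: 'boolean'
```

Under these assumptions, each key listed beneath the model name would become its own column in that staging model, cast to the given dtype.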