Configuration of the data warehouse#
Mappings in seed tables#
The implementation repo template will include default Ed-Fi descriptors and defaults for configuration of analytic interpretation. If your implementation has custom descriptors, they must be added to the following dbt seed tables. This reference guide will help you know where to look.
xwalk_attendance_events.csv: give attendanceEventDescriptors an analytic interpretation for whether a student is absent or not
xwalk_calendar_events.csv: determine which calendarEventDescriptors are and are not school days
xwalk_course_level_characteristics.csv: give course characteristics an analytic interpretation in standard categories for grouping of courses
xwalk_discipline_actions.csv: assign severity order and indicator (e.g. in-school, out-of-school, expulsion, minor) to discipline actions
xwalk_id_types_course.csv: assign types of course codes; common are local and state
xwalk_id_types_ed_org.csv: assign types of education organization IDs; common are state, local, and NCES codes
xwalk_id_types_staff.csv: assign types of educator IDs; common are state and district / local
xwalk_id_types_student.csv: assign types of student IDs; common are state and district / local
xwalk_letter_grades.csv: give letter grades an analytic interpretation about whether to include in GPA calculations, GPA points, D/F categorization, and sort orders.
metrics_thresholds/absentee_categories.csv: assign thresholds to an arbitrary number of absenteeism categories.
student/xwalk_student_characteristics.csv: standardize student characteristics into binary indicators
assessments/xwalk_student_assessment_subject: assign subject to a student assessment record based on a score result. This is necessary when an assessment identifier does not map to a single subject (e.g.
assessments/xwalk_assessment_scores: normalize student assessment score names across assessments to a determined set of scores to be used as columns in
assessments/xwalk_objective_assessment_scores: normalize student objective assessment score names across assessments to a determined set of scores to be used as columns in
Manage configuration options in dbt project configuration#
Each dbt project contains a dbt_project.yml, which is the other place for project-level configuration. In addition to standard dbt project configuration, the vars: key lets you control the rules applied by the edu warehouse. The dbt_project.yml in the edu warehouse repo contains default values for the variables listed below. To override or add to these defaults, update the dbt_project.yml in the implementation repo.
Configurable values are either "flattened" with an edu prefix, signifying that they are called explicitly by the edu warehouse code, or "nested", signifying that their structure is dependent on the implementation.
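The two shapes can be sketched side by side. This is an illustrative fragment rather than a copy of the warehouse defaults: it assumes both kinds of variable sit under the same vars: key and that the flattened key is written as one quoted string. The names are taken from elsewhere in this guide.

```yaml
vars:
  # "Flattened": one fixed, edu-prefixed key that the warehouse code looks up by name
  'edu:stu_demos:race_unknown_code': Unknown

  # "Nested": a dict whose inner keys depend on the implementation
  extensions:
    stg_ef3__student_special_education_program_associations:
      iep_exit_date:
        name: 'my_district_namespace:iepExitDate'
        dtype: 'timestamp'
```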
Configuring "flattened" variables:#
"Flattened" configurable variables are formatted as:
Existing "flattened" configurable variables:#
|Variable Name||Description||Example/Default Value*|
|edu:stu_demos:multiple_races_code||The string to display for race_ethnicity in dim_student if a student has multiple values for race in the ODS||Multiple|
|edu:stu_demos:hispanic_latino_code||The string to display for race_ethnicity in dim_student if a student has Hispanic/Latino ethnicity in the ODS||Latinx|
|edu:stu_demos:race_unknown_code||The string to display for race_ethnicity in dim_student if a student has no value for race in the ODS||Unknown|
|edu:stu_demos:start_date_column||The start date column from StudentSpecialEducationProgramAssociations used to compute whether a student is actively enrolled in special education program(s) -- as recorded in is_special_education_active in dim_student.||spec_ed_program_begin_date|
|edu:stu_demos:exit_date_column||The exit date column from StudentSpecialEducationProgramAssociations used to compute whether a student is actively enrolled in special education program(s) -- as recorded in is_special_education_active in dim_student||spec_ed_program_end_date|
|edu:stu_demos:exclude_programs||The list of program_name values from StudentSpecialEducationProgramAssociations to exclude when assigning students to is_special_education_active and other special_education variables in dim_student||Null|
|edu:stu_demos:in_attendance_code||The string to display for student attendance_event_category in attendance tables if a student does not have an absence/attendance record for a given day they were enrolled||In Attendance|
|edu:stu_demos:chronic_absence_threshold||Threshold of attendance rates that count as chronically absent (if 90, students with <90% attendance are chronically absent)||90|
|edu:stu_demos:chronic_absence_min_days||Threshold of enrolled days needed to count toward the chronic absence calculation||20|
|edu:stu_demos:exclude_withdraw_codes||The list of exclude_withdraw_type values that should be used to exclude students from fct_student_school_association, and all downstream measures||['No show', 'Invalid enrollment']|
* Note: the source of truth for default values is the dbt_project.yml of the edu warehouse repo, not this document.
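As an illustration of overriding these defaults, the fragment below sketches what the vars: block of an implementation repo's dbt_project.yml might look like. The variable names come from the table above. The override values (85, 10, and the extra withdraw code) are invented for the example, and the quoted-key syntax is an assumption to verify against the dbt_project.yml in the edu warehouse repo.

```yaml
# dbt_project.yml in the implementation repo (illustrative values only)
vars:
  # Count students below 85% attendance as chronically absent (documented default: 90)
  'edu:stu_demos:chronic_absence_threshold': 85

  # Require at least 10 enrolled days before a student counts (documented default: 20)
  'edu:stu_demos:chronic_absence_min_days': 10

  # List-valued variable, written as an ordinary YAML list.
  # 'Transferred in error' is a hypothetical custom code added to the two defaults.
  'edu:stu_demos:exclude_withdraw_codes':
    - 'No show'
    - 'Invalid enrollment'
    - 'Transferred in error'
```

The list above repeats the two documented defaults because, in dbt, a variable set in the root project replaces the package's value rather than merging with it. To extend a default list, restate the existing entries alongside the new one.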
Configuring "nested" variables that are Ed-Fi Extensions:#
To add any Ed-Fi extension variable to the warehouse, first confirm the data is available under "_ext" in the JSON payload returned by your implementation's API. Then add your nested var to the extensions: dict, using this format:
    vars:
      extensions:
        EDU_MODEL_NAME:
          EXTENSION_TARGET_COLUMN:
            name: 'EXTENSION AS FOUND IN _ext:'
            dtype: 'DATABASE DATA TYPE TO CONVERT TO'

For example:

    vars:
      extensions:
        stg_ef3__student_special_education_program_associations:
          iep_exit_date:
            name: 'my_district_namespace:iepExitDate'
            dtype: 'timestamp'