PowerShell to scrape a webpage and get school term dates

5 Apr

So, continuing on from my initial PowerShell script engine to control my home automation, I thought it would be nice to check the term dates for the schools around us. Not only are the roads a fair bit quieter out of term, but the kids don’t need to be up quite as early either, so I can stay in bed that bit longer. In theory, anyway.

The only list of term dates I could find was on the Birmingham Gov website. After a bit of investigation, I found that all the data I wanted was inside a class called “Editor”, and within that, the text for each term followed the same pattern. With a bit of replacing, I stripped each block down to a list of date components, which I then built back up into dates: the start and end of each term and, within that, the start and end of its half term.
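As a rough sketch of the "strip it down to date components" idea, here is one way to pull the four dates out of a single paragraph. It uses a regex rather than the chain of Replace calls in the function further down, so treat it as an alternative, not the method used there. The sample text is made up to match the pattern described above, and the variable names are only for illustration; the real page may be worded differently.

```powershell
# Hypothetical sample of one term paragraph; the real page text may differ.
$sample = 'Term Starts: Monday 3 September 2018<br>Half Term: Monday 29 October 2018 to Friday 2 November 2018<br>Term Ends: Friday 21 December 2018'

# Match every "day month year" triple, e.g. "3 September 2018".
$pattern = '\b\d{1,2} [A-Za-z]+ \d{4}\b'

# Parse month names as English regardless of the machine's regional settings.
$culture = [cultureinfo]::InvariantCulture

$dates = foreach ($m in [regex]::Matches($sample, $pattern)) {
    [datetime]::ParseExact($m.Value, 'd MMMM yyyy', $culture)
}

# Assumes exactly four dates, in page order:
# term start, half term start, half term end, term end.
$termStart, $halfTermStart, $halfTermEnd, $termEnd = $dates

Write-Host "Term: $($termStart.ToString('dd/MM/yyyy')) to $($termEnd.ToString('dd/MM/yyyy'))"
```

Because the regex only looks for the dates themselves, it does not care about the labels or the weekday names, which should make it a little less fragile if the wording around the dates changes.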

I assume that we’re not in term time to start with, then check whether today’s date falls inside any of the term start and end date ranges, but outside the matching half term start and end date range.
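The check itself can be tried out without touching the website. The sketch below pulls it into a small helper; `Test-TermDay`, its parameter names and the dates are all made up for illustration. It compares whole days only, so a time of day on the last day of a range still counts as inside it.

```powershell
# Hypothetical helper: the "in term but not in half term" check as a pure function.
function Test-TermDay {
    param(
        [datetime]$Date,
        [datetime]$TermStart,
        [datetime]$HalfTermStart,
        [datetime]$HalfTermEnd,
        [datetime]$TermEnd
    )

    # Drop the time of day so both ends of each range are inclusive.
    $day = $Date.Date

    $inTerm     = $day -ge $TermStart.Date -and $day -le $TermEnd.Date
    $inHalfTerm = $day -ge $HalfTermStart.Date -and $day -le $HalfTermEnd.Date

    return ($inTerm -and -not $inHalfTerm)
}

# Made-up dates for one term, for illustration only.
$term = @{
    TermStart     = [datetime]'2018-09-03'
    HalfTermStart = [datetime]'2018-10-29'
    HalfTermEnd   = [datetime]'2018-11-02'
    TermEnd       = [datetime]'2018-12-21'
}

Test-TermDay -Date ([datetime]'2018-09-10') @term          # True  (in term)
Test-TermDay -Date ([datetime]'2018-10-31') @term          # False (half term)
Test-TermDay -Date ([datetime]'2018-12-21 08:00') @term    # True  (last day of term, with a time)
Test-TermDay -Date ([datetime]'2018-12-25') @term          # False (after term ends)
```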

A little bit hacky, and if the page format changes I’ll end up re-writing this, but it seems to work for now.

I’ve wrapped it in a function so I can just check if today is a term day or not.

function IsTermDay {
    #assume is not a term day to start
    $istermdate = $false

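    #compare on the date only, otherwise the last day of a range fails the -le check once it is past midnight
    $todaysdate = (get-date).Date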
    #$todaysdate = [datetime]::ParseExact('28/05/2018','dd/MM/yyyy',$null)
    #write-host $todaysdate

    $url = "https://www.birmingham.gov.uk/info/20014/schools_and_learning/685/school_term_dates/1"
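    #AllElements comes from the HTML parsing in Windows PowerShell 5.1 and is not available in PowerShell 6+
    $result = Invoke-WebRequest $url
    $mydata = $result.AllElements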
    $separator = " "

    #so now we need to check todays date against all the term dates
    #if today is between outerstart and outerend, but not in the range inner start and inner end

    foreach ($element in $mydata)
    {    

    if($element.tagName -eq 'P'){
        if($element.innerHTML.StartsWith("Term")){
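            #use a separate variable so the $mydata collection being looped over is not overwritten
            $termtext = $element.innerHTML.ToString()

            #strip the labels so only the dates are left
            $termtext = $termtext.Replace("Term Starts: ","")
            $termtext = $termtext.Replace("Half Term: ","")
            $termtext = $termtext.Replace("Term Ends: ","")
            #turn line breaks (a <br> tag or a raw newline) into a single space
            $termtext = $termtext -replace '(?i)(\s*<br\s*/?>\s*|\r?\n)+', ' '
            $termtext = $termtext.Replace("to ","")

            #write-host $termtext

            $splitevent = $termtext -split $separator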
        
            #token layout after the split (0, 4, 8 and 12 should be the weekday names):
            #  1-3   term start (day, month, year)
            #  5-7   half term start
            #  9-11  half term end
            #  13-15 term end
            #write-host $splitevent

            $outerstartdate = [datetime]::ParseExact($splitevent[1] + "-" + $splitevent[2] + "-" + $splitevent[3], "d-MMMM-yyyy", $null)
            $outerenddate = [datetime]::ParseExact($splitevent[13] + "-" + $splitevent[14] + "-" + $splitevent[15], "d-MMMM-yyyy", $null)

            $innerstartdate = [datetime]::ParseExact($splitevent[5] + "-" + $splitevent[6] + "-" + $splitevent[7], "d-MMMM-yyyy", $null)
            $innerenddate = [datetime]::ParseExact($splitevent[9] + "-" + $splitevent[10] + "-" + $splitevent[11], "d-MMMM-yyyy", $null)

            #write-host $outerstartdate
            #write-host $outerenddate

            #write-host $innerstartdate
            #write-host $innerenddate

            if($todaysdate -ge $outerstartdate -and $todaysdate -le $outerenddate){
                #write-host "in this outer range"
                $istermdate = $true
                if($todaysdate -ge $innerstartdate -and $todaysdate -le $innerenddate){
                    #write-host "in this inner range"
                    $istermdate = $false
                    }
                }

            }
        }

    }

    return $istermdate
}


if(IsTermDay){
    write-host "Get up early"
}else{
    write-host "Stay in bed a bit later today"
}

I can now add this to my master script and change schedules for wake up lights etc. based on the result.