Commit 359a56a

Merge pull request #17 from NHSDigital/websitecheckerupdate
updated github action for ease of use
2 parents 4070b1c + 975753d commit 359a56a

7 files changed

Lines changed: 75 additions & 110 deletions

File tree

.github/workflows/errorChecker.yml

Lines changed: 0 additions & 29 deletions
This file was deleted.

.github/workflows/linkchecker.yml

Lines changed: 0 additions & 28 deletions
This file was deleted.

.github/workflows/spellChecker.yml

Lines changed: 0 additions & 36 deletions
This file was deleted.
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+---
+name: Simplifier IG Website Checking
+on:
+  # Allows you to run this workflow manually from the Actions tab
+  workflow_dispatch:
+    inputs:
+      websiteurl:
+        default: "https://simplifier.net/guide/uk-core-implementation-guidance-directory?version=current"
+jobs:
+  job1:
+    name: html error checker
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout repo content
+        uses: actions/checkout@v3
+      - name: Set up python
+        uses: actions/setup-python@v4
+        with:
+          python-version: 3.x
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r ./IGPageContentValidator/requirements.txt
+      - name: Execute HTML Error Check
+        run: INPUT_STORE=${{ github.event.inputs.websiteurl }} python ./IGPageContentValidator/errorChecker.py
+  job2:
+    name: url link checker
+    runs-on: ubuntu-latest
+    steps:
+      - name: checkout repo content
+        uses: actions/checkout@v3
+      - name: Install dependencies
+        run: |
+          sudo apt install python3-bs4 python3-dnspython python3-requests
+          pip3 install linkchecker
+      - name: Check input link is valid
+        run: >
+          echo 'exit codes can be found at
+          https://everything.curl.dev/usingcurl/returns'
+
+          curl ${{ github.event.inputs.websiteurl }} -s -f -o /dev/null
+      - name: Execute Link Check
+        run: >
+          linkchecker -r 2 --check-extern --no-status -f
+          ./IGPageContentValidator/linkcheckerrc ${{ github.event.inputs.websiteurl }} || test $? = 1;
+  job3:
+    name: spell checker
+    runs-on: ubuntu-latest
+    steps:
+      - name: checkout repo content
+        uses: actions/checkout@v3
+      - name: Set up python
+        uses: actions/setup-python@v4
+        with:
+          python-version: 3.x
+      - name: Install dependencies
+        run: |
+          sudo apt install aspell
+          python -m pip install --upgrade pip
+          pip install -r ./IGPageContentValidator/requirements.txt
+      - name: execute relToAbsLinks.py
+        run: INPUT_STORE=${{ github.event.inputs.websiteurl }} python ./IGPageContentValidator/relToAbsLinks.py
+
+      - name: Execute Spell Check
+        run: cat OutputLinks.txt | while read p; do wget -nv -O - $p | aspell list -H --camel-case --lang en_GB --add-html-skip=nocheck -p ./IGPageContentValidator/.aspell.en.pws |sort| uniq -c; echo -e '\n'; done;
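
The `Execute Link Check` step ends with `|| test $? = 1`, which changes which exit codes fail the job. A minimal sketch of that shell logic in Python may make the effect easier to see. The function name `step_exit_status` is hypothetical, and the reading of exit code 1 as "invalid links were found" comes from LinkChecker's documentation rather than from this commit, so treat it as an assumption.

```python
def step_exit_status(linkchecker_exit: int) -> int:
    """Mirror the shell expression `linkchecker ... || test $? = 1`.

    Returns the exit status the workflow step would end with.
    """
    if linkchecker_exit == 0:
        # Left-hand side succeeded, so `test` never runs.
        return 0
    # `test $? = 1` succeeds only when the previous exit code was exactly 1.
    return 0 if linkchecker_exit == 1 else 1


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, "->", step_exit_status(code))
```

Under that assumption, a run that finds broken links still shows a green tick, and only other non-zero exit codes turn the job red, so the link report has to be read from the step log.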

IGPageContentValidator/README.md

Lines changed: 8 additions & 8 deletions
@@ -1,6 +1,6 @@
 # Simplifier Implementation Guide Page Content Validation
 
-The validator works by scraping the webpage within website.txt for any internal webpage links within the Simplifier Guide. These webpages are then validated individually.
+The validator works by scraping the webpage for any internal webpage links within the Simplifier Guide. These webpages are then validated individually.
 
 The website validation is in three parts:
 - HTML Error Checking - This checks each page for any html errors. This captures any errors caused by using Simplifier relative links, e.g `{{pagelink: }}`, amongst the usual coding errors.
@@ -9,13 +9,13 @@ The website validation is in three parts:
 
 ## Instructions
 
-1. Edit the file `website.txt` ensuring the website you want scraped is entered on the first line. Note: Only Simplifier.net guides will work with this checker.
-2. Click the `Actions` button. the top 3 actions will be the individual checkers needed. Wait until there is a green tick next to each.
-3. Within each Action click the `Build` button
-4. Within the Build click the following for the results:
-   - HTML Error Check
-   - Link Check
-   - Spell Check
+1. Go to [Actions..websiteChecker](https://github.com/NHSDigital/IOPS-FHIR-Test-Scripts/actions/workflows/websiteChecker.yml)
+2. Click `Run workflow`.
+3. Enter the website url into the `websiteurl` box and click `Run workflow`.
+4. Click on the action and then click on the following for the results:
+   - html error checker
+   - link checker
+   - spell checker
 
 ## HTML Error Checking
 Uses the errorChecker.py script. Checks for any html errors on a website using BeautifulSoup's `find_all('div',{'class':"error"})`. This returns the errors for each individual page.
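
The README says the check looks for `div` elements with the class `error`. The sketch below illustrates that idea only; it is not the contents of `errorChecker.py`. It swaps BeautifulSoup for the standard library's `html.parser` so that it runs without dependencies, and the names `ErrorDivFinder` and `find_error_divs` are hypothetical. Unlike `find_all`, it folds an error `div` nested inside another error `div` into the outer one.

```python
from html.parser import HTMLParser


class ErrorDivFinder(HTMLParser):
    """Collect the text of every <div> whose class list contains 'error'."""

    def __init__(self):
        super().__init__()
        self.errors = []    # text of each completed error div
        self._depth = 0     # div nesting depth inside the current error div
        self._chunks = []   # text fragments seen inside the current error div

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self._depth:
            # Already inside an error div: track nesting so we know where it ends.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "error" in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Normalise whitespace before storing the message.
                self.errors.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def find_error_divs(html: str) -> list:
    finder = ErrorDivFinder()
    finder.feed(html)
    finder.close()
    return finder.errors


if __name__ == "__main__":
    page = '<div class="content"><div class="error">Page <b>link</b> not found</div></div>'
    print(find_error_divs(page))
```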

IGPageContentValidator/linkScraper.py

Lines changed: 2 additions & 2 deletions
@@ -5,9 +5,9 @@
 
 from bs4 import BeautifulSoup # this module helps in web scrapping.
 import requests # this module helps us to download a web page
+import os
 
-with open('./IGPageContentValidator/website.txt', 'r') as file:
-    data = file.readline().strip('\n')
+data = os.environ['INPUT_STORE']
 
 '''returns html page of link within website.txt'''
 def RequestData(url):
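
With this change the script takes its target from the `INPUT_STORE` environment variable, so `os.environ['INPUT_STORE']` raises a bare `KeyError` when the script is run outside the workflow without that variable set. One possible way to make that failure easier to read is sketched below. The helper name `read_target_url` and its error message are hypothetical and not part of the commit.

```python
import os


def read_target_url(env=os.environ, name="INPUT_STORE"):
    """Return the URL passed in through the environment, stripped of whitespace.

    Exits with a readable message instead of a KeyError when the variable
    is missing or empty.
    """
    url = env.get(name, "").strip()
    if not url:
        raise SystemExit(
            f"{name} is not set; export it with the guide URL before running the script"
        )
    return url


if __name__ == "__main__":
    print(read_target_url())
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary, without touching the real process environment.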

IGPageContentValidator/website.txt

Lines changed: 0 additions & 7 deletions
This file was deleted.
