Commit 68f7677

tiosgz authored and Guts committed
fix retrieving article description
What made me start this change were two logic bugs: chars_count being checked against -1 (instead of None) and abstract_delimiter taking higher priority than unlimited chars_count.

Apart from fixing the above, this commit reorders the if-elif chain according to priority and, in the documentation, mentions how abstract_chars_count and abstract_delimiter interact.

Breaking changes, as far as I'm aware:
- abstract_chars_count == -1 is again highest-priority,
- more effort (using abstract_delimiter) before non-compliance,
- chars_count is now an inclusive setting, increasing the article length before it has to be shortened by one character.

None of these changes conflicts with the pre-existing documentation and none should be noticeable in a well-configured environment.
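The priority order described in this message can be sketched roughly as below. This is a simplified illustration, not the plugin's code: `pick_description` and its parameters are hypothetical names, the page is reduced to a plain string, the Markdown-to-HTML conversion is left out, and the delimiter test is assumed to be a simple substring check.

```python
from typing import Optional


def pick_description(
    chars_count: int,
    description: Optional[str],
    content: str,
    delimiter: Optional[str],
) -> str:
    """Hypothetical stand-in for the selection order after this commit."""
    # 1. A negative value (-1) means "unlimited": the full content wins.
    if chars_count < 0 and content:
        return content
    # 2. An explicit description from the page metadata.
    if description:
        return description
    # 3. Everything before the delimiter, if it occurs in the page
    #    (assumption: a plain substring search).
    if delimiter and delimiter in content:
        return content[: content.index(delimiter)]
    # 4. The first chars_count characters; the bound is inclusive.
    if chars_count > 0 and content:
        if len(content) <= chars_count:
            return content
        return f"{content[: chars_count - 3]}..."
    # 5. Nothing usable: the item ends up without a description.
    return ""


print(pick_description(-1, "meta", "full text", None))
print(pick_description(150, None, "intro<!-- more -->rest", "<!-- more -->"))
```

With `chars_count` set to `0`, steps 2 and 3 are still tried before the function falls through to the empty string, which is the "more effort before non-compliance" point above.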
1 parent a08052a commit 68f7677

2 files changed

Lines changed: 26 additions & 32 deletions

docs/configuration.md

Lines changed: 6 additions & 9 deletions
@@ -240,26 +240,23 @@ Output:
 
 ### `abstract_chars_count`: item description length
 
-To fill each [item description element](https://www.w3schools.com/xml/rss_tag_title_link_description_item.asp):
+Used, in combination with `abstract_delimiter`, to determine each [item description element](https://www.w3schools.com/xml/rss_tag_title_link_description_item.asp):
 
 - If this value is set to `-1`, then the articles' full HTML content will be filled into the description element.
-- be careful: if set to `0` and there is no description, the feed's compliance is broken (an item must have a description)
 - Otherwise, the plugin first tries to retrieve the value of the keyword `description` from the [page metadata].
-- If the value is non-negative and no `description` meta is found, then the plugin retrieves the first number of characters of the page content defined by this setting. Retrieved content is the raw markdown converted roughly into HTML.
+- If that fails and `abstract_delimiter` is found in the page, the article content up to (but not including) the delimiter is used.
+- If the above has failed, then the plugin retrieves the first number of characters of the page content defined by this setting. Retrieved content is the raw markdown converted roughly into HTML.
+- Be careful: if set to `0` and there is no description, the feed's compliance is broken (an item must have a description).
 
 `abstract_chars_count`: number of characters to use as item description.
 
 Default: `150`
 
 ----
 
-#### `abstract_delimiter`: abstract delimiter
+### `abstract_delimiter`: abstract delimiter
 
-Used to fill each [item description element](https://www.w3schools.com/xml/rss_tag_title_link_description_item.asp):
-
-- If this value is set to `-1`, then the full HTML content will be filled into the description element.
-- Otherwise, the plugin first tries to retrieve the value of the key `description` from the page metadata.
-- If the value is non-negative and no `description` meta is found, then the plugin retrieves the first number of characters of the page content defined by this setting. Retrieved content is the raw markdown converted rougthly into HTML (i.e. without extension, etc.).
+Please see `abstract_chars_count` for how this setting is used.
 
 `abstract_delimiter`: string to mark .
 
mkdocs_rss_plugin/util.py

Lines changed: 20 additions & 23 deletions
@@ -472,18 +472,16 @@ def get_description_or_abstract(
         if chars_count < 0:
             chars_count = None
 
-        # If the abstract chars is not unlimited and the description exists,
-        # return the description.
-        if description and chars_count is not None:
+        # If the full page is wanted (unlimited chars count)
+        if chars_count is None and (in_page.content or in_page.markdown):
+            if in_page.content:
+                return in_page.content
+            else:
+                return markdown.markdown(in_page.markdown, output_format="html5")
+        # If the description is explicitly given
+        elif description:
             return description
-        # If no description and chars_count set to 0, return empty string
-        elif not description and chars_count == 0:
-            logger.warning(
-                f"No description set for page {in_page.file.src_uri} "
-                "and 'abstract_chars_count' set to 0. The feed won't be compliant, "
-                "because an item must have a description."
-            )
-            return ""
+        # If the description is cut by the delimiter
         elif (
             abstract_delimiter
             and (
@@ -495,24 +493,23 @@ def get_description_or_abstract(
                 in_page.markdown[:excerpt_separator_position],
                 output_format="html5",
             )
-        # If chars count is unlimited, use the html content
-        elif in_page.content and chars_count == -1:
-            if chars_count is None or len(in_page.content) < chars_count:
-                return in_page.content[:chars_count]
-        # Use markdown
-        elif in_page.markdown:
-            if chars_count is None or len(in_page.markdown) < chars_count:
-                return markdown.markdown(
-                    in_page.markdown[:chars_count], output_format="html5"
-                )
+        # Use first chars_count from the markdown
+        elif chars_count > 0 and in_page.markdown:
+            if len(in_page.markdown) <= chars_count:
+                return markdown.markdown(in_page.markdown, output_format="html5")
             else:
                 return markdown.markdown(
                     f"{in_page.markdown[: chars_count - 3]}...",
                     output_format="html5",
                 )
-        # Unlimited chars_count but no content is found, then return the description.
+        # No explicit description and no content is found
         else:
-            return description if description else ""
+            logger.warning(
+                f"No description set for page {in_page.file.src_uri} "
+                "and 'abstract_chars_count' set to 0. The feed won't be compliant, "
+                "because an item must have a description."
+            )
+            return ""
 
     def get_image(self, in_page: Page, base_url: str) -> Optional[Tuple[str, str, int]]:
         """Get page's image from page meta or social cards and returns properties.
