Skip to content

isso: migrate: Handle deleted comments in Disqus migration#994

Merged
jelmer merged 5 commits intoisso-comments:masterfrom
pkvach:fix/disqus-migration-deleted-comments
Mar 18, 2024
Merged

isso: migrate: Handle deleted comments in Disqus migration#994
jelmer merged 5 commits intoisso-comments:masterfrom
pkvach:fix/disqus-migration-deleted-comments

Conversation

@pkvach
Copy link
Copy Markdown
Contributor

@pkvach pkvach commented Feb 27, 2024

Checklist

  • All new and existing tests are passing
  • I have added an entry to CHANGES.rst because this is a user-facing change or an important bugfix
  • I have written proper commit message(s)

What changes does this Pull Request introduce?

  • isso/migrate.py: Improved handling of deleted comments during Disqus migration by ensuring that deleted comments are imported with an empty message text, enhancing the robustness of the migration process.
  • isso/utils/html.py: Add conditional check for None before Markdown conversion in HTML utility, improving the system's resilience to null input values.
  • Updated unit tests, ensuring that the changes are correctly implemented.

Why is this necessary?

Fixes #979

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 10, 2024

please complete the checklist in the PR template

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Mar 10, 2024

Thank you for the review.
I apologize if I got it wrong, as I'm doing my first PRs here.
Added an entry to CHANGES.rst

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 16, 2024

I don't have much background on disqus, but what's the point of still importing these comments but then making isso ignore them? Couldn't we just filter out the deleted comments during import or is there some point in keeping them?

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Mar 17, 2024

During migration, deleted comments are ignored if they have no child comments.
But in case like #979, the deleted comment has children. And then we have to migrate it.
This follows the same logic as in the main system: a comment is not completely deleted if it has children.

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 17, 2024

Is setting the comment text to null the right thing to do in that case though, or we just set it to an empty string at import time (and change the schema to disallow null) ?

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Mar 17, 2024

In the main scenario, if a comment needs to be marked as deleted and not removed from the database, its text is set to an empty string.

self.db.execute('UPDATE comments SET text=? WHERE id=?', ('', id))

But in the Disqus migration null was set and it produced an error when processing comments for output:

2023-11-30 15:22:04,646 ERROR: GET /
Traceback (most recent call last):
  File "isso/__init__.py", line 152, in dispatch
    response = handler(request.environ, request, **values)
  File "isso/views/__init__.py", line 43, in dec
    return func(cls, env, req, *args, **kwargs)
  File "isso/views/comments.py", line 894, in fetch
    'replies': self._process_fetched_list(root_list, plain),
  File "isso/views/comments.py", line 953, in _process_fetched_list
    item['text'] = self.isso.render(item['text'])
  File "isso/__init__.py", line 129, in render
    return self.markup.render(text)
  File "isso/utils/html.py", line 104, in render
    return self._render(text)
  File "isso/utils/html.py", line 101, in <lambda>
    self._render = lambda text: sanitizer.sanitize(parser(text))
  File "isso/utils/html.py", line 58, in inner
    rv = md(text).rstrip("\n")
  File "misaka/api.py", line 176, in __call__
    lib.hoedown_buffer_puts(ib, text.encode('utf-8'))
AttributeError: 'NoneType' object has no attribute 'encode'

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 17, 2024

Right, what I'm saying is - why not fix this in the disqus migration by having it set message to an empty string when it is null?

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Mar 18, 2024

Sorry if I misunderstood you, but that's what I made the changes to migrate.py for. If the comment text is not specified, set it to an empty string.
Changes in html.py to prevent crashing if comment text is null. This is like an extra check in case such comments are already written to the database or will be written in other cases.

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 18, 2024

Sorry if I misunderstood you, but that's what I made the changes to migrate.py for. If the comment text is not specified, set it to an empty string. Changes in html.py to prevent crashing if comment text is null. This is like an extra check in case such comments are already written to the database or will be written in other cases.

Yeah, it's the change in html.py that I'm objecting to. We should rather just initialize the database to mandate that text can't be null rather than adjusting the application to cope with it.

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Mar 18, 2024

Yeah, it's the change in html.py that I'm objecting to. We should rather just initialize the database to mandate that text can't be null rather than adjusting the application to cope with it.

That's a good point, thank you.
I reverted the changes on html.py and added NOT NULL constraint for the text field when creating the comment table.
Do you think we need to add some sort of migration for existing databases?

@jelmer
Copy link
Copy Markdown
Member

jelmer commented Mar 18, 2024

Yeah, it's the change in html.py that I'm objecting to. We should rather just initialize the database to mandate that text can't be null rather than adjusting the application to cope with it.

That's a good point, thank you. I reverted the changes on html.py and added NOT NULL constraint for the text field when creating the comment table. Do you think we need to add some sort of migration for existing databases?

That'd be a nice addition (in isso/db.py) but I don't think it's strictly necessary

@jelmer jelmer merged commit 6dffaa7 into isso-comments:master Mar 18, 2024
@pkvach pkvach deleted the fix/disqus-migration-deleted-comments branch March 18, 2024 20:09
@ix5
Copy link
Copy Markdown
Member

ix5 commented Apr 13, 2024

Apologies for the late comment. It appears to me that since this PR changes the DB schema in 6dffaa7#diff-115e97a46b3ca0f35e7455d6e624550eff6fe684c3354f42545d06b874eba433R41 we ought to update the internal database version and add a migration clause:

isso/isso/db/__init__.py

Lines 66 to 123 in 6dffaa7

def migrate(self, to):
if self.version >= to:
return
logger.info("migrate database from version %i to %i", self.version, to)
# re-initialize voters blob due a bug in the bloomfilter signature
# which added older commenter's ip addresses to the current voters blob
if self.version == 0:
from isso.utils import Bloomfilter
bf = memoryview(Bloomfilter(iterable=["127.0.0.0"]).array)
with sqlite3.connect(self.path) as con:
con.execute('UPDATE comments SET voters=?', (bf, ))
con.execute('PRAGMA user_version = 1')
logger.info("%i rows changed", con.total_changes)
# move [general] session-key to database
if self.version == 1:
with sqlite3.connect(self.path) as con:
if self.conf.has_option("general", "session-key"):
con.execute('UPDATE preferences SET value=? WHERE key=?', (
self.conf.get("general", "session-key"), "session-key"))
con.execute('PRAGMA user_version = 2')
logger.info("%i rows changed", con.total_changes)
# limit max. nesting level to 1
if self.version == 2:
def first(rv):
return list(map(operator.itemgetter(0), rv))
with sqlite3.connect(self.path) as con:
top = first(con.execute(
"SELECT id FROM comments WHERE parent IS NULL").fetchall())
flattened = defaultdict(set)
for id in top:
ids = [id, ]
while ids:
rv = first(con.execute(
"SELECT id FROM comments WHERE parent=?", (ids.pop(), )))
ids.extend(rv)
flattened[id].update(set(rv))
for id in flattened.keys():
for n in flattened[id]:
con.execute(
"UPDATE comments SET parent=? WHERE id=?", (id, n))
con.execute('PRAGMA user_version = 3')
logger.info("%i rows changed", con.total_changes)

@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Apr 14, 2024

@ix5
Thank you for the review.
Should I make a new PR to add changes on migration?

@ix5
Copy link
Copy Markdown
Member

ix5 commented Apr 23, 2024

@ix5 Thank you for the review. Should I make a new PR to add changes on migration?

Yes please!

pkvach added a commit to pkvach/isso that referenced this pull request Apr 27, 2024
…migration

This update introduces a schema migration to version 4 for the database, focusing on enhancing the 'comments' table. This ensures that the 'text' field in the 'comments' table will always have a value, which improves data consistency and integrity.

See:
- isso-comments#979
- isso-comments#994
pkvach added a commit to pkvach/isso that referenced this pull request Apr 27, 2024
…migration

This update introduces a schema migration to version 4 for the database, focusing on enhancing the 'comments' table. This ensures that the 'text' field in the 'comments' table will always have a value, which improves data consistency and integrity.

See:
- isso-comments#979
- isso-comments#994
@pkvach
Copy link
Copy Markdown
Contributor Author

pkvach commented Apr 27, 2024

@ix5 Thank you for the review. Should I make a new PR to add changes on migration?

Yes please!

Added a PR with migration: #1019

pkvach added a commit to pkvach/isso that referenced this pull request May 2, 2024
…migration

This update introduces a schema migration to version 4 for the database, focusing on enhancing the 'comments' table. This ensures that the 'text' field in the 'comments' table will always have a value, which improves data consistency and integrity.

See:
- isso-comments#979
- isso-comments#994
pkvach added a commit to pkvach/isso that referenced this pull request May 4, 2024
…migration

This update introduces a schema migration to version 4 for the database, focusing on enhancing the 'comments' table. This ensures that the 'text' field in the 'comments' table will always have a value, which improves data consistency and integrity.

See:
- isso-comments#979
- isso-comments#994
pkvach added a commit to pkvach/isso that referenced this pull request May 4, 2024
…migration

This update introduces a schema migration to version 4 for the database, focusing on enhancing the 'comments' table. This ensures that the 'text' field in the 'comments' table will always have a value, which improves data consistency and integrity.

See:
- isso-comments#979
- isso-comments#994
ix5 pushed a commit to pkvach/isso that referenced this pull request May 5, 2024
…migration

This update introduces a schema migration to version 4 for
the database, focusing on enhancing the 'comments' table.
This ensures that the 'text' field in the 'comments' table
will always have a value, which improves data consistency
and integrity.

See:
- isso-comments#979
- isso-comments#994
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

NULL text from deleted comment results in error 500

3 participants