SQL error cause by page_name too long when Sync Notion pages #14314

Open
5 tasks done
xlight opened this issue Feb 25, 2025 · 1 comment
Labels
🐞 bug Something isn't working good first issue Good first issue for newcomers

Comments


xlight commented Feb 25, 2025

Self Checks

  • This is only for bug reports. If you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.15.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Create a new Knowledge and connect it to Notion.
Sync the Notion pages.
Click "Save & Process".

✔️ Expected Behavior

The Knowledge is created successfully.

❌ Actual Behavior

HTTP request error

API URL: /console/api/datasets/05bad57c-64be-4829-88d8-b424d04b8468/documents

error message

{"message": "Internal Server Error", "code": "unknown"}

service log

 ERROR [Dummy-4] [app.py:875] - Exception on /console/api/datasets/05baxxxxxxx468/documents [POST]
Traceback (most recent call last):
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)

The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  
  File "/app/api/.venv/lib/python3.12/site-packages/flask_restful/__init__.py", line 696, in wrapper
    resp = f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "/app/api/controllers/console/wraps.py", line 85, in decorated
    return view(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/controllers/console/datasets/datasets_document.py", line 275, in post
    documents, batch = DocumentService.save_document_with_dataset_id(dataset, knowledge_config, current_user)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/services/dataset_service.py", line 976, in save_document_with_dataset_id
    db.session.flush()
  
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/base.py", line 1967, in _exec_single_context
    self.dialect.do_execute(
  File "/app/api/.venv/lib/python3.12/site-packages/sqlalchemy/engine/default.py", line 941, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.errors.StringDataRightTruncation) value too long for type character varying(255)
[SQL: INSERT INTO documents (tenant_id, dataset_id, position, data_source_type, data_source_info, dataset_process_rule_id, batch, name, created_from, created_by, created_api_request_id, processing_started_at, file_id, word_count, parsing_completed_at, cleaning_completed_at, splitting_completed_at, tokens, indexing_latency, completed_at, paused_by, paused_at, error, stopped_at, disabled_at, disabled_by, archived_reason, archived_by, archived_at, doc_type, doc_form, doc_language) VALUES (%(tenant_id)s::UUID, %(dataset_id)s::UUID, %(position)s, %(data_source_type)s, %(data_source_info)s, %(dataset_process_rule_id)s::UUID, %(batch)s, %(name)s, %(created_from)s, %(created_by)s::UUID, %(created_api_request_id)s::UUID, %(processing_started_at)s, %(file_id)s, %(word_count)s, %(parsing_completed_at)s, %(cleaning_completed_at)s, %(splitting_completed_at)s, %(tokens)s, %(indexing_latency)s, %(completed_at)s, %(paused_by)s::UUID, %(paused_at)s, %(error)s, %(stopped_at)s, %(disabled_at)s, %(disabled_by)s::UUID, %(archived_reason)s, %(archived_by)s::UUID, %(archived_at)s, %(doc_type)s, %(doc_form)s, %(doc_language)s) RETURNING documents.id, documents.created_at, documents.is_paused, documents.indexing_status, documents.enabled, documents.archived, documents.updated_at]
[parameters: {'tenant_id': '1c49xxxxxx61fcff53', 'dataset_id': '05xxxxxx468', 'position': 6, 'data_source_type': 'notion_import', 'data_source_info': '{"notion_workspace_id": "1c493be2-ac7e-4810-a84d-e9a161fcff53", "notion_page_id": "6633bxxxx302044", "notion_page_icon": null, "type": "page"}', 'dataset_process_rule_id': 'e43dxxxxc01fdac3a', 'batch': '2025xxxx1160009', 'name': 'linuxmint/timeshift: System restore tool for Linux. Creates filesystem snapshots using rsync+hardlinks, or BTRFS snapshots. Supports scheduled snapshots, multiple backup levels, and exclude filters. Snapshots can be restored while system is running or from Live CD/USB.', 'created_from': 'web', 'created_by': 'afec66fbxxxxx96468e4', 'created_api_request_id': None, 'processing_started_at': None, 'file_id': None, 'word_count': None, 'parsing_completed_at': None, 'cleaning_completed_at': None, 'splitting_completed_at': None, 'tokens': None, 'indexing_latency': None, 'completed_at': None, 'paused_by': None, 'paused_at': None, 'error': None, 'stopped_at': None, 'disabled_at': None, 'disabled_by': None, 'archived_reason': None, 'archived_by': None, 'archived_at': None, 'doc_type': None, 'doc_form': 'text_model', 'doc_language': 'English'}]
(Background on this error at: https://sqlalche.me/e/20/9h9h)
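
For reference, a minimal sketch of why the insert fails: the `name` parameter from the log above is longer than the 255 characters that `character varying(255)` allows (PostgreSQL counts characters for this type, not bytes). The helper `overflow` and the constant `NAME_COLUMN_LIMIT` are hypothetical names used only for illustration; they are not part of Dify.

```python
# Hypothetical helper, not part of Dify: reports how far a value overshoots a column limit.
NAME_COLUMN_LIMIT = 255  # taken from "character varying(255)" in the error message


def overflow(value: str, limit: int = NAME_COLUMN_LIMIT) -> int:
    """Return how many characters `value` exceeds `limit` by (0 if it fits)."""
    return max(0, len(value) - limit)


# The Notion page title copied from the 'name' parameter in the service log.
page_name = (
    "linuxmint/timeshift: System restore tool for Linux. "
    "Creates filesystem snapshots using rsync+hardlinks, or BTRFS snapshots. "
    "Supports scheduled snapshots, multiple backup levels, and exclude filters. "
    "Snapshots can be restored while system is running or from Live CD/USB."
)

# A positive result means the INSERT would be rejected by the column constraint.
print(overflow(page_name) > 0)
```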

@dosubot dosubot bot added the 🐞 bug Something isn't working label Feb 25, 2025

dosubot bot commented Feb 25, 2025

Hey @xlight! 👋 I'm here to help you with any bugs, questions, or contributions you have. Let's squash those bugs together!

The error you're encountering is due to the 'name' field in the 'documents' table exceeding the maximum allowed length of 255 characters. This is causing a StringDataRightTruncation error when trying to insert the data into the database.

To resolve this issue, validate or truncate the 'name' value before inserting it into the database: either cut the name down to at most 255 characters, or raise an error when the name is too long.

Here's a simple example of how you might truncate the name:

def truncate_name(name: str, max_length: int = 255) -> str:
    # Keep at most max_length characters so the value fits a varchar(255) column.
    return name[:max_length]

# Usage
document_name = "Your long document name here..."
document_name = truncate_name(document_name)

Make sure to apply this logic in the part of your code where the document name is being set or processed before the database insertion. This will prevent the SQL error from occurring.
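
The two options mentioned above (truncate, or reject the name) could be sketched as follows. This is only an illustration under stated assumptions: `fit_name`, `require_fit`, and `MAX_NAME_LENGTH` are hypothetical names, not existing Dify functions, and the limit of 255 is assumed from the `character varying(255)` error in the log.

```python
MAX_NAME_LENGTH = 255  # assumed to match the documents.name column, character varying(255)


def fit_name(name: str, max_length: int = MAX_NAME_LENGTH, marker: str = "...") -> str:
    """Truncate `name` to `max_length` characters, ending with `marker` when it was cut."""
    if len(name) <= max_length:
        return name
    if max_length <= len(marker):
        # No room for the marker, so fall back to a plain cut.
        return name[:max_length]
    return name[: max_length - len(marker)] + marker


def require_fit(name: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Reject over-long names instead of silently shortening them."""
    if len(name) > max_length:
        raise ValueError(
            f"document name is {len(name)} characters; limit is {max_length}"
        )
    return name
```

Truncating with a visible marker keeps the sync working and signals that the title was shortened, while raising an error lets the caller return a clear message instead of an Internal Server Error. Which one fits better depends on how the maintainers want long Notion titles handled.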


@xlight xlight changed the title SQL error cause by page.name too long when Sync Notion pages SQL error cause by page_name too long when Sync Notion pages Feb 25, 2025
xlight added a commit to xlight/dify that referenced this issue Feb 25, 2025
fix the bug: 
SQL error cause by page_name too long when Sync Notion pages
langgenius#14314
@crazywoola crazywoola added the good first issue Good first issue for newcomers label Feb 25, 2025