Debug notes: index generation blowing up

This bug genuinely ate a huge amount of my time. It shows up as index.log filling with a large number of errors like the following:

11/05/2019 20:28:42 [WARNING] seafes:110 thread_task: Request Error: TransportError(400, u'action_request_validation_exception', u'Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 532;')
Traceback (most recent call last):
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/index_local.py", line 105, in thread_task
    self.fileindexupdater.update_repo(repo_id, commit_id)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/file_index_updater.py", line 80, in update_repo
    self.check_recovery(repo_id)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/file_index_updater.py", line 76, in check_recovery
    self.update_files_index(repo_id, old, new)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/file_index_updater.py", line 64, in update_files_index
    self.files_index.add_files(repo_id, version, added_files)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/indexes/repo_files.py", line 125, in add_files
    self.add_file_to_index(repo_id, version, path, obj_id, mtime, size)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/seafes/indexes/repo_files.py", line 171, in add_file_to_index
    id=_doc_id(repo_id, path))
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/client/utils.py", line 73, in _wrapped
    return func(*args, params=params, **kwargs)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/client/__init__.py", line 300, in index
    _make_path(index, doc_type, id), params=params, body=body)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/transport.py", line 312, in perform_request
    status, headers, data = connection.perform_request(method, url, params, body, ignore=ignore, timeout=timeout)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/connection/http_urllib3.py", line 128, in perform_request
    self._raise_error(response.status, raw_data)
  File "/opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/connection/base.py", line 125, in _raise_error
    raise HTTP_EXCEPTIONS.get(status_code, TransportError)(status_code, error_message, additional_info)
RequestError: TransportError(400, u'action_request_validation_exception', u'Validation Failed: 1: id is too long, must be no longer than 512 bytes but was: 532;')

This happens when Elasticsearch builds the search index for a file whose absolute path is too long. Elasticsearch caps a document id at 512 bytes (that is the limit quoted in the error above), and when seafes generates the index entry it prepends a prefix to the path via _doc_id(repo_id, path), so a long path plus the prefix overshoots the cap. Newer Elasticsearch releases apparently address the length limit, but the official Docker image does not seem to ship one of them.
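
The traceback shows the id comes out of _doc_id(repo_id, path). I did not dig into its exact format, but a minimal sketch, assuming the id is simply the repo UUID glued onto the file path, illustrates the overflow:

    # Minimal sketch; the real _doc_id() in seafes may compose the id
    # differently. This only shows how the prefix plus a long path can
    # blow past Elasticsearch's 512-byte _id cap.
    repo_id = "c3a9711a-0000-4000-8000-000000000000"  # hypothetical 36-char repo UUID
    path = "/" + "x" * 490                            # hypothetical long file path

    doc_id = repo_id + path                            # assumed id composition
    print(len(doc_id.encode("utf-8")))                 # 527 > 512 -> RequestError(400)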

Because the index update task runs every ten minutes by default, the error keeps firing over and over, driving CPU load through the roof.
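
That ten-minute cadence is the seafevents file-index task. From memory (treat the section and key names as an assumption and verify against your own seafevents.conf), the knob looks like the snippet below; stretching the interval is a stopgap while you clean up:

    [INDEX FILES]
    enabled = true
    interval = 10m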

The fix:

  1. Delete the files whose paths are too long (the sketch after this list can help find them)
  2. Enter the seafile container
  3. cd seafile-pro-server-7.0.9
  4. ./pro/pro.py search --clear
  5. ./pro/pro.py search --update
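
For step 1, something like this sketch can locate the offenders when run against a locally synced copy of the library (the mount point is hypothetical, and the 339-byte threshold comes from the arithmetic below):

    import os

    MAX_PATH_BYTES = 339  # safe limit worked out in the next paragraph

    def find_long_paths(library_root):
        # Walk a local copy of the library and report every file whose path,
        # relative to the library root, exceeds MAX_PATH_BYTES in UTF-8.
        for dirpath, _dirnames, filenames in os.walk(library_root):
            for name in filenames:
                rel = "/" + os.path.relpath(os.path.join(dirpath, name), library_root)
                nbytes = len(rel.encode("utf-8"))
                if nbytes > MAX_PATH_BYTES:
                    print(nbytes, rel)

    find_long_paths("/path/to/synced/library")  # hypothetical local copy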

Doing the math: for my overlong file, the path measured from the library root came to 359, and the error says the id overshot by 20 (532 - 512), so keeping paths within 339 should be safe.
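
One caveat: the limit is counted in bytes, not characters, so non-ASCII names burn the budget roughly three times as fast (a CJK character is 3 bytes in UTF-8). Measure candidate paths with encode(), not plain len() (the file name here is made up):

    print(len("年度报告-最终版.txt"))                  # 12 characters
    print(len("年度报告-最终版.txt".encode("utf-8")))  # 26 bytes -- 3 bytes per CJK character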

How did I find this bug? I followed the traceback to /opt/seafile/seafile-pro-server-7.0.9/pro/python/elasticsearch/connection/http_urllib3.py and stuffed the following code in at line 126; it then dumps the failing request, which reveals the offending file:

        if not (200 <= response.status < 300) and response.status not in ignore:
            # FGN DEBUG: dump every failing request ('a' appends, so repeated
            # errors don't overwrite each other)
            with open("/fgn_debug_file", 'a') as fgn_debug_file:
                for item in (method, full_url, url, body, duration,
                             response.status, raw_data):
                    # write() rather than print(): print() goes to stdout,
                    # not into the opened file
                    fgn_debug_file.write("%s\n" % (item,))
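
After the next indexing pass, /fgn_debug_file accumulates each failing request; the request body should include the path being indexed, which is how the overlong file turned up.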