Python RuntimeError problems and how to fix them
Here is the error message that appears:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Here is the original source code that produces the error:

import multiprocessing as mp
import time
from urllib.request import urlopen
from urllib.parse import urljoin     # urljoin lives in urllib.parse
from bs4 import BeautifulSoup
import re

base_url = "https://morvanzhou.github.io/"

# crawl: fetch one page
def crawl(url):
    response = urlopen(url)
    time.sleep(0.1)
    return response.read().decode()

# parse: extract the title, the internal links and the canonical url
def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    urls = soup.find_all('a', {"href": re.compile('^/.+?/$')})
    title = soup.find('h1').get_text().strip()
    page_urls = set([urljoin(base_url, url['href']) for url in urls])
    url = soup.find('meta', {'property': "og:url"})['content']
    return title, page_urls, url

unseen = set([base_url])
seen = set()
restricted_crawl = True

pool = mp.Pool(4)                       # created at module level: this is what triggers the error on Windows
count, t1 = 1, time.time()
while len(unseen) != 0:                 # still got some urls to visit
    if restricted_crawl and len(seen) > 20:
        break
    print('\nDistributed Crawling...')
    crawl_jobs = [pool.apply_async(crawl, args=(url,)) for url in unseen]
    htmls = [j.get() for j in crawl_jobs]       # request connections

    print('\nDistributed Parsing...')
    parse_jobs = [pool.apply_async(parse, args=(html,)) for html in htmls]
    results = [j.get() for j in parse_jobs]     # parse html

    print('\nAnalysing...')
    seen.update(unseen)     # mark the crawled urls as seen
    unseen.clear()          # nothing unseen

    for title, page_urls, url in results:
        print(count, title, url)
        count += 1
        unseen.update(page_urls - seen)     # collect new urls to crawl

print('Total time: %.1f s' % (time.time()-t1))    # 16 s !!!
Here is the corrected code:

import multiprocessing as mp
import time
from urllib.request import urlopen
from urllib.parse import urljoin     # urljoin lives in urllib.parse
from bs4 import BeautifulSoup
import re

base_url = "https://morvanzhou.github.io/"

# crawl: fetch one page
def crawl(url):
    response = urlopen(url)
    time.sleep(0.1)
    return response.read().decode()

# parse: extract the title, the internal links and the canonical url
def parse(html):
    soup = BeautifulSoup(html, 'html.parser')
    urls = soup.find_all('a', {"href": re.compile('^/.+?/$')})
    title = soup.find('h1').get_text().strip()
    page_urls = set([urljoin(base_url, url['href']) for url in urls])
    url = soup.find('meta', {'property': "og:url"})['content']
    return title, page_urls, url

def main():
    unseen = set([base_url])
    seen = set()
    restricted_crawl = True

    pool = mp.Pool(4)                       # now only created when main() runs
    count, t1 = 1, time.time()
    while len(unseen) != 0:                 # still got some urls to visit
        if restricted_crawl and len(seen) > 20:
            break
        print('\nDistributed Crawling...')
        crawl_jobs = [pool.apply_async(crawl, args=(url,)) for url in unseen]
        htmls = [j.get() for j in crawl_jobs]       # request connections

        print('\nDistributed Parsing...')
        parse_jobs = [pool.apply_async(parse, args=(html,)) for html in htmls]
        results = [j.get() for j in parse_jobs]     # parse html

        print('\nAnalysing...')
        seen.update(unseen)     # mark the crawled urls as seen
        unseen.clear()          # nothing unseen

        for title, page_urls, url in results:
            print(count, title, url)
            count += 1
            unseen.update(page_urls - seen)     # collect new urls to crawl

    print('Total time: %.1f s' % (time.time()-t1))    # 16 s !!!

if __name__ == '__main__':
    main()
To summarize: wrap your top-level code in a function (main() here) and guard the call with

if __name__ == '__main__':
    main()

That one change resolves the problem. The reason it works: on Windows, multiprocessing starts child processes with spawn rather than fork, so every child re-imports the main module. Without the guard, the module-level Pool code runs again inside each child and tries to spawn processes recursively, which raises the RuntimeError shown above.
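As a general pattern, here is a minimal sketch of how any multiprocessing script should be laid out; it is independent of the crawler above, and the work() function is made up purely for illustration:

import multiprocessing as mp

def work(x):
    return x * x                     # executed in a child process

def main():
    with mp.Pool(4) as pool:         # create the pool only inside main()
        print(pool.map(work, range(10)))

if __name__ == '__main__':           # children re-import this module but skip this block
    main()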
Python error: RuntimeError
This section deals with errors of the type "RuntimeError: fails to pass a sanity check due to a bug in the windows runtime".
Causes of this error
1. An incompatibility between your Python and numpy versions; for example, the combination I was using, Python 3.9 with numpy 1.19.4, produces this error.
2. numpy 1.19.4 has this problem with many current Python versions.
Solution
In PyCharm, downgrade numpy under File -> Settings -> Project: pycharmProjects -> Project Interpreter.
1. Open the interpreter settings.
2. Double-click numpy to change its version.
3. Tick the checkbox that allows specifying a version, then select and install the lower version you need.
Once that is done, rerun the program and the error should be gone.
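If you prefer the command line to PyCharm, the same downgrade can be done with pip. Note that pinning 1.19.3 below is an assumption based on the commonly reported workaround for the numpy 1.19.4 bug; adjust it to whichever lower version you need. A quick sketch to verify which version ends up active:

# Assumed command-line equivalent of the PyCharm steps above:
#   pip install numpy==1.19.3
import numpy

print(numpy.__version__)     # should now show the downgraded version, e.g. 1.19.3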
The above is based on my personal experience; I hope it gives everyone a useful reference, and I hope you will continue to support 脚本之家.