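Python Basics Guide: Common String Search and Replace Methods Explained

Updated: 2026-06-08 08:52:42  Author: 星河耀银海

Python strings have more than 40 built-in methods, which fall roughly into three groups: searching and replacing (this article), splitting and joining, and case conversion and checks. Searching and replacing are the most frequently used, so this article walks through them in detail.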

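1. Introduction: The Appeal of String Methods

This article begins our tour of string methods: the tools that let you search, replace, and test the contents of a string.

Python strings have more than 40 built-in methods. Don't let the number scare you: only about 15 see frequent use in day-to-day development. I cover them in three groups: searching and replacing (this article), splitting and joining, and case conversion and checks.

Searching and replacing are the operations you reach for most when processing text. Finding keywords in a passage, filling placeholders in a template, and extracting data in a particular format all depend on them.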

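2. find() and rfind(): Locating a Substring

2.1 Basic Usage of find()

find() searches a string for a substring and returns the index of its first occurrence. If the substring is not found, it returns -1 instead of raising an error.

text = 'Python is easy, Python is powerful'

# Search for a substring
print(text.find('Python'))     # 0 (index of the first occurrence)
print(text.find('is'))         # 7
print(text.find('Java'))       # -1 (not found)

# Restrict the search range: find(sub, start, end)
print(text.find('Python', 10))    # 16 (search starts at index 10)
print(text.find('Python', 1, 10)) # -1 (no match within text[1:10])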

2.2 rfind(): Searching from the Right

text = 'Python is easy, Python is powerful'

# rfind() returns the index of the last occurrence
print(text.rfind('Python'))   # 16
print(text.rfind('is'))       # 23 (the second 'is')
print(text.find('is'))        # 7 (the first 'is')

2.3 find() in Practice

# Extract the file extension
def get_file_extension(filename):
    dot_index = filename.rfind('.')
    if dot_index == -1:
        return ''  # no extension
    return filename[dot_index:]

print(get_file_extension('report.pdf'))   # .pdf
print(get_file_extension('archive.tar.gz')) # .gz
print(get_file_extension('README'))       # ''


# Extract the protocol from a URL
def get_url_protocol(url):
    separator = url.find('://')
    if separator == -1:
        return 'unknown'
    return url[:separator]

print(get_url_protocol('https://example.com'))  # https
print(get_url_protocol('ftp://files.server'))   # ftp


# Find every occurrence
def find_all_occurrences(text, substring):
    """Return every index at which substring occurs in text (overlaps included)"""
    positions = []
    start = 0
    while True:
        pos = text.find(substring, start)
        if pos == -1:
            break
        positions.append(pos)
        start = pos + 1
    return positions

text = 'abababab'
print(find_all_occurrences(text, 'ab'))  # [0, 2, 4, 6]
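
The hand-written helpers above are good practice with find() and rfind(). For real file names and URLs, the standard library already covers the edge cases. The sketch below is one possible alternative, not the method this article uses: get_suffix is a hypothetical helper name, and it relies on pathlib.PurePosixPath and urllib.parse.urlsplit.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def get_suffix(filename):
    """Return the final extension of filename, or '' if it has none."""
    return PurePosixPath(filename).suffix


print(get_suffix('report.pdf'))      # .pdf
print(get_suffix('archive.tar.gz'))  # .gz
# A leading dot marks a hidden file, not an extension. The rfind('.')
# version above would return '.bashrc' for this name.
print(repr(get_suffix('.bashrc')))   # ''
print(PurePosixPath('archive.tar.gz').suffixes)  # ['.tar', '.gz']

# urlsplit() parses the scheme without any manual index arithmetic.
print(urlsplit('https://example.com/path').scheme)  # https
```

Whether the hidden-file behavior is what you want depends on the application, so check it against your own naming rules.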

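3. index() and rindex(): Search, but Raise on Failure

These behave almost exactly like find() and rfind(). The only difference is that they raise ValueError instead of returning -1 when the substring is not found.

text = 'Python编程'

# When the substring is found, the behavior is identical
print(text.index('Python'))  # 0
print(text.index('编程'))     # 6

# The difference: an exception when the substring is missing
# print(text.index('Java'))  # ValueError: substring not found

# Safe usage: wrap the call in try-except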
def safe_index(text, substring):
    try:
        return text.index(substring)
    except ValueError:
        return -1

print(safe_index('Python', 'Java'))  # -1

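find() or index()?

Use find() in most cases: it never raises, and comparing the result with -1 is enough.
Use index() when the substring must be present: its absence means the data is wrong, and an error is better than a silent failure.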
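
One caveat about checking the result of find(): the return value must be compared with -1 explicitly, because 0 (a match at the very start) is falsy and -1 (no match) is truthy. A minimal sketch of the pitfall and two ways around it; position_or_none is a hypothetical helper name used only for illustration.

```python
text = 'Python is easy'

# Pitfall: using the result of find() directly as a condition.
print(bool(text.find('Python')))  # False, although 'Python' is present (index 0)
print(bool(text.find('Java')))    # True, although 'Java' is absent (-1)


def position_or_none(text, substring):
    """Return the index of substring, or None when it does not occur."""
    pos = text.find(substring)
    return pos if pos != -1 else None


print(position_or_none(text, 'easy'))  # 10
print(position_or_none(text, 'Java'))  # None

# When only a yes/no answer is needed, the `in` operator is clearer.
print('Python' in text)  # True
print('Java' in text)    # False
```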

4. count(): Counting Occurrences

text = 'banana'

print(text.count('a'))      # 3
print(text.count('na'))     # 2
print(text.count('z'))      # 0

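# Count within a range
print(text.count('a', 3))       # 2 (counting starts at index 3)
print(text.count('a', 0, 3))    # 1 (counts within text[0:3])

# Practical example
def count_keywords(text, keywords):
    """Count case-insensitive occurrences of each keyword in the text"""
    lowered = text.lower()  # lowercase once instead of once per keyword
    result = {}
    for keyword in keywords:
        result[keyword] = lowered.count(keyword.lower())
    return result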

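article = """
Python is an interpreted language. Python's design philosophy emphasizes code readability.
Python supports multiple programming paradigms, including object-oriented and functional programming.
"""
word_counts = count_keywords(article, ['Python', 'programming', 'language'])
print(word_counts)
# {'Python': 3, 'programming': 2, 'language': 1}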
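
Note that count() counts non-overlapping occurrences. If overlapping matches matter, a find() loop that advances by one character works, as in the sketch below; count_overlapping is a hypothetical helper name, not a built-in.

```python
def count_overlapping(text, substring):
    """Count occurrences of substring in text, allowing matches to overlap."""
    if not substring:
        raise ValueError('substring must not be empty')
    total = 0
    pos = text.find(substring)
    while pos != -1:
        total += 1
        # Advance by one character (not by len(substring)) to allow overlaps.
        pos = text.find(substring, pos + 1)
    return total


print('aaaa'.count('aa'))                 # 2 (matches at 0 and 2)
print(count_overlapping('aaaa', 'aa'))    # 3 (matches at 0, 1 and 2)
print('banana'.count('ana'))              # 1
print(count_overlapping('banana', 'ana')) # 2
```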

5. startswith() and endswith(): Testing the Start and End

5.1 Basic Usage

filename = 'report_2024.pdf'

print(filename.startswith('report'))  # True
print(filename.endswith('.pdf'))      # True
print(filename.endswith('.txt'))      # False

# Pass a tuple to match any of several options
print(filename.endswith(('.pdf', '.doc', '.docx')))  # True
print(filename.endswith(('.jpg', '.png', '.gif')))   # False

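5.2 Restricting the Range

text = 'Python Programming'

# startswith() accepts optional start and end arguments
print(text.startswith('Pro', 7))         # True (checks from index 7)
print(text.startswith('Pro', 7, 10))     # True

# endswith() accepts the same range arguments (it checks the end of the slice)
print(text.endswith('ing', 0, 18))       # True (text[0:18] is the whole string)
print(text.endswith('ing', 0, 17))       # False (text[0:17] ends with 'min')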

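5.3 Practical Examples

# Filter a list of files
def filter_files_by_extension(files, extensions):
    """Return the files whose names end with one of the given extensions"""
    # extensions must be a str or a tuple of lowercase str; endswith() rejects a list
    return [f for f in files if f.lower().endswith(extensions)]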

files = ['data.csv', 'report.pdf', 'photo.jpg', 'script.py', 'notes.txt']
csv_and_py = filter_files_by_extension(files, ('.csv', '.py'))
print(csv_and_py)  # ['data.csv', 'script.py']


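# Categorize by prefix
def categorize_by_prefix(items, prefix_map):
    """Group items by prefix; prefix_map maps each prefix to a category name"""
    # Key the result by category name (the dict values), not by prefix
    categories = {category: [] for category in prefix_map.values()}
    categories['other'] = []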

    for item in items:
        categorized = False
        for prefix, category in prefix_map.items():
            if item.startswith(prefix):
                categories[category].append(item)
                categorized = True
                break
        if not categorized:
            categories['other'].append(item)

    return categories

files = ['img_001.jpg', 'doc_report.pdf', 'img_002.png', 'doc_notes.txt']
prefix_map = {'img_': 'images', 'doc_': 'documents'}
print(categorize_by_prefix(files, prefix_map))
# {'images': ['img_001.jpg', 'img_002.png'],
#  'documents': ['doc_report.pdf', 'doc_notes.txt'],
#  'other': []}

6. replace(): Replacing Content

6.1 Basic Usage

text = 'Hello, World! World is beautiful.'

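# Basic replacement
print(text.replace('World', 'Python'))
# Hello, Python! Python is beautiful.

# Limit the number of replacements
print(text.replace('World', 'Python', 1))
# Hello, Python! World is beautiful. (only the first occurrence is replaced)

# Delete content (replace with an empty string)
print(text.replace('World', ''))
# Hello, !  is beautiful.
# (Note the leftover spaces, which may need extra cleanup)

# Replacing a substring that does not occur raises no error; the string comes back unchanged
print(text.replace('Java', 'Python'))  # Hello, World! World is beautiful.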

6.2 Advanced Uses of replace()

# Clean up extra whitespace in text
def clean_whitespace(text):
    """Collapse runs of spaces, tabs, and newlines into a single space"""
    # Convert tabs and newlines to spaces first
    text = text.replace('\t', ' ').replace('\n', ' ')
    # Then collapse runs of spaces into one
    while '  ' in text:
        text = text.replace('  ', ' ')
    return text.strip()

raw_text = 'Hello    world\t\tPython   \n\n   programming'
print(clean_whitespace(raw_text))  # Hello world Python programming


# Template substitution
def fill_template(template, **kwargs):
    """Fill the placeholders in a template with the keyword-argument values"""
    result = template
    for key, value in kwargs.items():
        placeholder = '{' + key + '}'
        result = result.replace(placeholder, str(value))
    return result

template = 'Dear {name}, your order {order_id} has been {status}.'
message = fill_template(
    template,
    name='Xiao Ming',
    order_id='20240530001',
    status='shipped'
)
print(message)  # Dear Xiao Ming, your order 20240530001 has been shipped.


# Chaining replace() calls
text = 'a b c d e'
result = text.replace('a', '1').replace('b', '2').replace('c', '3')
print(result)  # 1 2 3 d e


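# Multiple replacements driven by a dict
def multi_replace(text, replacements):
    """Apply several replacements in sequence, in the dict's insertion order"""
    for old, new in replacements.items():
        text = text.replace(old, new)
    return text

# '&' must come first; otherwise the '&' inside '&lt;' and the other entities would be escaped again
replace_map = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
}
html_text = '<div class="content">Hello & Welcome</div>'
escaped = multi_replace(html_text, replace_map)
print(escaped)
# &lt;div class=&quot;content&quot;&gt;Hello &amp; Welcome&lt;/div&gt;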
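
For HTML escaping specifically, the standard library's html.escape() does the same job as the replacement table, so the hand-rolled version is best treated as an exercise. The sketch below also shows why the order of chained replace() calls matters; the variable names wrong and right are illustrative only.

```python
import html

html_text = '<div class="content">Hello & Welcome</div>'
print(html.escape(html_text))
# &lt;div class=&quot;content&quot;&gt;Hello &amp; Welcome&lt;/div&gt;

# Escaping '&' last re-escapes the ampersands produced by the earlier steps.
wrong = '<b>'.replace('<', '&lt;').replace('>', '&gt;').replace('&', '&amp;')
right = '<b>'.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
print(wrong)  # &amp;lt;b&amp;gt;
print(right)  # &lt;b&gt;
```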

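7. translate() and maketrans(): Batch Character Mapping

For batch replacement of single characters, translate() is usually much more efficient than calling replace() repeatedly, because it processes the string in a single pass:

# Build a character mapping table
# str.maketrans(x, y, z)
# x: characters to replace
# y: replacement characters (matched one-to-one with x, so the lengths must be equal)
# z: characters to delete (optional)

# Caesar cipher: shift every letter forward by 3 positions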
import string
lowercase = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'
shifted = lowercase[3:] + lowercase[:3]  # 'defghijklmnopqrstuvwxyzabc'
translation_table = str.maketrans(lowercase + lowercase.upper(),
                                   shifted + shifted.upper())

message = 'Hello, Python!'
encrypted = message.translate(translation_table)
print(encrypted)  # Khoor, Sbwkrq!


# Delete all digits and punctuation
text = '你好，Python3！这是一个测试2024。'  # "Hello, Python3! This is a test 2024."
# Characters to delete go in the third argument of maketrans()
digits_and_punctuation = '0123456789，。！？、：；“”‘’（）'
translator = str.maketrans('', '', digits_and_punctuation)
cleaned = text.translate(translator)
print(cleaned)  # 你好Python这是一个测试


# Use translate() for simple text sanitization
def sanitize_filename(filename):
    """Replace characters that are illegal in file names with underscores"""
    illegal_chars = r'<>:"/\|?*'
    translator = str.maketrans(illegal_chars, '_' * len(illegal_chars))
    return filename.translate(translator)

print(sanitize_filename('report:2024/05/30.pdf'))
# report_2024_05_30.pdf

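When should you use translate() instead of replace()?

Batch mapping of single characters to single characters: translate() is usually much faster.
Multi-character substring to multi-character substring: translate() cannot do this, so use replace().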
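
There is a middle case between the two above. When str.maketrans() is given a single dict, each key must still be one character (or its code point), but each value may be a string of any length, or None to delete the character. Because translate() makes one pass, characters produced by one mapping are never re-processed by another. A small sketch, with an illustrative table:

```python
# Dict form of maketrans(): single-character keys, arbitrary-length values.
table = str.maketrans({
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '\t': None,  # None deletes the character
})

print('<a>\tb & c</a>'.translate(table))
# &lt;a&gt;b &amp; c&lt;/a&gt;
```

Unlike the chained replace() approach, the order of the dict entries does not affect the result here.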

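8. The strip() Family: Removing Leading and Trailing Characters

8.1 strip() / lstrip() / rstrip()

# strip(): removes leading and trailing whitespace (the default)
text = '   Hello, World!   \n'
print(repr(text.strip()))  # 'Hello, World!'

# Remove specific characters (the argument is a set of characters, not a substring!)
text = '###Hello, World!###'
print(text.strip('#'))     # Hello, World!

text = 'www.example.com'
print(text.strip('w.moc'))  # example
# Note: strip('w.moc') keeps removing w, ., m, o, and c from both ends
# until it reaches a character that is not in the set

# lstrip(): removes from the left only
text = '   Hello   '
print(repr(text.lstrip()))  # 'Hello   '

# rstrip(): removes from the right only
text = '   Hello   '
print(repr(text.rstrip()))  # '   Hello'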
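
Because strip() works on a set of characters, it can remove more than intended. When the goal is to drop one exact prefix or suffix, str.removeprefix() and str.removesuffix() are the better fit. The sketch below assumes Python 3.9 or later, where these two methods were added.

```python
url = 'www.example.com'

# strip()/lstrip() treat the argument as a set of characters:
print(url.strip('w.moc'))            # example
print('www.wow.com'.lstrip('w.'))    # ow.com (the 'w' of 'wow' is eaten too)

# removeprefix()/removesuffix() remove one exact substring, at most once:
print(url.removeprefix('www.'))                       # example.com
print(url.removeprefix('www.').removesuffix('.com'))  # example
print('www.wow.com'.removeprefix('www.'))             # wow.com

# If the prefix is absent, the string is returned unchanged.
print('example.com'.removeprefix('www.'))             # example.com
```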

8.2 strip() in Practice

# Clean up user input
def clean_user_input(text):
    """Strip surrounding whitespace, then surrounding ASCII and full-width punctuation"""
    return text.strip().strip('.,;:!?。，；：！？')


inputs = ['   小明   ', '小红。。。', '  小刚,,,   ', '，小丽。']
for inp in inputs:
    print(repr(clean_user_input(inp)))
# '小明'
# '小红'
# '小刚'
# '小丽'


# Read a value from a config file, dropping comments and whitespace
def parse_config_line(line):
    """Parse one line of a config file"""
    # Drop the comment (everything from '#' onward)
    comment_index = line.find('#')
    if comment_index != -1:
        line = line[:comment_index]

    # Strip leading and trailing whitespace
    return line.strip()


config_lines = [
    'server = localhost  # database server address',
    'port = 3306         # database port',
    'username = admin',
    ''
]
for line in config_lines:
    result = parse_config_line(line)
    if result:
        print(f'Parsed: {result}')

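9. replace() vs re.sub(): When to Use Regular Expressions

replace() can only replace a fixed string. For pattern matching (such as "all digits" or "email addresses"), you need regular expressions:

import re

text = 'My number is 138-1234-5678, and his number is 139-8765-4321.'

# replace() cannot do this, because every number is different
# Use a regular expression instead
masked = re.sub(r'\d{3}-\d{4}-\d{4}', '***-****-****', text)
print(masked)  # My number is ***-****-****, and his number is ***-****-****.


# replace() can do this, so no regex is needed
text = 'Hello, WORLD! Hello, world!'
print(text.replace('Hello', 'Hi'))  # Hi, WORLD! Hi, world!

# replace() cannot do this: case-insensitive replacement
print(re.sub('world', 'Python', text, flags=re.IGNORECASE))  # Hello, Python! Hello, Python!

Rule of thumb: if replace() solves the problem, use replace(), which is simple and clear; reach for regular expressions only when you need pattern matching.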
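
When the search text is a literal that happens to contain regex metacharacters, pass it through re.escape() before handing it to re.sub(). The following is a minimal sketch of a case-insensitive literal replacement; replace_ignore_case is a hypothetical helper name, not part of the re module.

```python
import re


def replace_ignore_case(text, old, new):
    """Replace every case-insensitive occurrence of the literal string old."""
    # re.escape() makes '.', '*', '?' and friends match themselves.
    pattern = re.compile(re.escape(old), flags=re.IGNORECASE)
    # A function replacement inserts `new` verbatim, so backslashes in it
    # are not interpreted as group references.
    return pattern.sub(lambda match: new, text)


print(replace_ignore_case('Price: 1.5 USD, 1x5 usd', '1.5 usd', '$1.50'))
# Price: $1.50, 1x5 usd
```

Without re.escape(), the '.' in '1.5 usd' would match any character, and '1x5 usd' would be replaced as well.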

10. Performance of Searching and Replacing

import time

text = 'a' * 10000 + 'target' + 'a' * 10000

# find() is more flexible than `in`: it returns the position and accepts a search range
start = time.perf_counter()
pos = text.find('target')
elapsed = time.perf_counter() - start
print(f'find: {elapsed:.8f}s, position={pos}')

# count() has to scan the whole string, whereas find() stops at the first match
start = time.perf_counter()
cnt = text.count('a')
elapsed = time.perf_counter() - start
print(f'count: {elapsed:.8f}s, count={cnt}')

# replace() vs translate() (character-replacement scenario)
chars_to_replace = 'abcdefghijklmnopqrstuvwxyz'
chars_replacement = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

# replace() approach: one pass over the string per character
start = time.perf_counter()
result1 = text
for old, new in zip(chars_to_replace, chars_replacement):
    result1 = result1.replace(old, new)
elapsed_replace = time.perf_counter() - start
print(f'replace: {elapsed_replace:.4f}s')

# translate() approach: a single pass
start = time.perf_counter()
table = str.maketrans(chars_to_replace, chars_replacement)
result2 = text.translate(table)
elapsed_translate = time.perf_counter() - start
print(f'translate: {elapsed_translate:.4f}s')
# translate() is usually noticeably faster here: one pass instead of 26.
# A single perf_counter() measurement is noisy, so treat these numbers as indicative.
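
For a steadier comparison than a single timed run, the timeit module repeats each call many times. The sketch below assumes the same test string as above; with_replace and with_translate are hypothetical wrapper names, and the absolute numbers will vary by machine and Python version.

```python
import timeit

text = 'a' * 10000 + 'target' + 'a' * 10000
lower = 'abcdefghijklmnopqrstuvwxyz'
upper = lower.upper()
table = str.maketrans(lower, upper)


def with_replace(s):
    """Uppercase ASCII letters with 26 chained replace() calls."""
    for old, new in zip(lower, upper):
        s = s.replace(old, new)
    return s


def with_translate(s):
    """Uppercase ASCII letters with a single translate() call."""
    return s.translate(table)


# Both approaches must agree before their speed is compared.
assert with_replace(text) == with_translate(text)

CALLS = 100
for func in (with_replace, with_translate):
    # repeat() returns one total per run; the minimum is the least noisy.
    best = min(timeit.repeat(lambda: func(text), number=CALLS, repeat=3))
    print(f'{func.__name__}: {best / CALLS * 1e6:.1f} microseconds per call')
```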

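11. Summary

Quick reference for string search and replace methods:

Method	Purpose	When nothing is found
`find()`	Find the position of a substring	Returns -1
`rfind()`	Find from the right	Returns -1
`index()`	Find the position of a substring	Raises ValueError
`rindex()`	Find from the right	Raises ValueError
`count()`	Count occurrences	Returns 0
`startswith()`	Test the prefix	Returns False
`endswith()`	Test the suffix	Returns False
`replace()`	Replace a substring	Returns the string unchanged
`translate()`	Batch character mapping	Leaves unmapped characters unchanged
`strip()`	Remove leading and trailing characters	Returns the string unchanged (the result can be '' if every character is stripped)

These methods form the foundation of string processing in Python. In my day-to-day work, find(), replace(), strip(), and startswith()/endswith() cover about 90% of my string search-and-replace operations. The next article covers splitting and joining, the core skills for handling structured text such as CSV files, logs, and paths.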
