matplotlib中plt.hist()参数解释及应用实例

2025-01-29 11:44:45

一、plt.hist()参数详解

简介：
plt.hist()：直方图，一种特殊的柱状图。
将统计值的范围分段，即将整个值的范围分成一系列间隔，然后计算每个间隔中有多少值。
直方图也可以被归一化以显示“相对”频率。然后，它显示了属于几个类别中的每个类别的占比，其高度总和等于1。

import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.pyplot import MultipleLocator
from matplotlib import ticker
%matplotlib inline

plt.hist(x, bins=None, range=None, density=None, weights=None, cumulative=False, bottom=None, histtype='bar', align='mid', orientation='vertical', rwidth=None, log=False, color=None, label=None, stacked=False, normed=None, *, data=None, **kwargs)

常用参数解释：
x: 作直方图所要用的数据，必须是一维数组；多维数组可以先进行扁平化再作图；必选参数；
bins: 直方图的柱数，即要分的组数，默认为10；
range：元组(tuple)或None；剔除较大和较小的离群值，给出全局范围；如果为None，则默认为(x.min(), x.max())；即x轴的范围；
density：布尔值。如果为true，则返回的元组的第一个参数n将为频率而非默认的频数；
weights：与x形状相同的权重数组；将x中的每个元素乘以对应权重值再计数；如果normed或density取值为True，则会对权重进行归一化处理。这个参数可用于绘制已合并的数据的直方图；
cumulative：布尔值；如果为True，则计算累计频数；如果normed或density取值为True，则计算累计频率；
bottom：数组，标量值或None；每个柱子底部相对于y=0的位置。如果是标量值，则每个柱子相对于y=0向上/向下的偏移量相同。如果是数组，则根据数组元素取值移动对应的柱子；即直方图上下便宜距离；
histtype：{‘bar’, ‘barstacked’, ‘step’, ‘stepfilled’}；'bar’是传统的条形直方图；'barstacked’是堆叠的条形直方图；'step’是未填充的条形直方图，只有外边框；‘stepfilled’是有填充的直方图；当histtype取值为’step’或’stepfilled’，rwidth设置失效，即不能指定柱子之间的间隔，默认连接在一起；
align：{‘left’, ‘mid’, ‘right’}；‘left’：柱子的中心位于bins的左边缘；‘mid’：柱子位于bins左右边缘之间；‘right’：柱子的中心位于bins的右边缘；
orientation：{‘horizontal’, ‘vertical’}：如果取值为horizontal，则条形图将以y轴为基线，水平排列；简单理解为类似bar()转换成barh()，旋转90°；
rwidth：标量值或None。柱子的宽度占bins宽的比例；
log：布尔值。如果取值为True，则坐标轴的刻度为对数刻度；如果log为True且x是一维数组，则计数为0的取值将被剔除，仅返回非空的(frequency, bins, patches）；
color：具体颜色，数组（元素为颜色）或None。
label：字符串（序列）或None；有多个数据集时，用label参数做标注区分；
stacked：布尔值。如果取值为True，则输出的图为多个数据集堆叠累计的结果；如果取值为False且histtype=‘bar’或’step’，则多个数据集的柱子并排排列；
normed: 是否将得到的直方图向量归一化，即显示占比，默认为0，不归一化；不推荐使用，建议改用density参数；
edgecolor: 直方图边框颜色；
alpha: 透明度；

返回值（用参数接收返回值，便于设置数据标签）：
n：直方图向量，即每个分组下的统计值，是否归一化由参数normed设定。当normed取默认值时，n即为直方图各组内元素的数量（各组频数）；
bins: 返回各个bin的区间范围；
patches：返回每个bin里面包含的数据，是一个list。
其他参数与plt.bar()类似。

二、plt.hist()简单应用

import matplotlib.pyplot as plt
%matplotlib inline

# 最简单，只传递x，组数，宽度，范围
plt.hist(data13['carrier_no'], bins=11, rwidth=0.8, range=(1,12), align='left')
plt.show()

三、plt.bar()综合应用

import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.pyplot import MultipleLocator
from matplotlib import ticker
%matplotlib inline

plt.figure(figsize=(8,5), dpi=80)
# 拿参数接收hist返回值，主要用于记录分组返回的值，标记数据标签
n, bins, patches = plt.hist(data13['carrier_no'], bins=11, rwidth=0.8, range=(1,12), align='left', label='xx直方图')
for i in range(len(n)):
    plt.text(bins[i], n[i]*1.02, int(n[i]), fontsize=12, horizontalalignment="center") #打标签，在合适的位置标注每个直方图上面样本数
plt.ylim(0,16000)
plt.title('直方图')
plt.legend()
# plt.savefig('直方图'+'.png')
plt.show()

附官方参数解释

Parameters
----------
x : (n,) array or sequence of (n,) arrays
    Input values, this takes either a single array or a sequence of
    arrays which are not required to be of the same length.

bins : int or sequence or str, optional
    If an integer is given, ``bins + 1`` bin edges are calculated and
    returned, consistent with `numpy.histogram`.

    If `bins` is a sequence, gives bin edges, including left edge of
    first bin and right edge of last bin.  In this case, `bins` is
    returned unmodified.

    All but the last (righthand-most) bin is half-open.  In other
    words, if `bins` is::

        [1, 2, 3, 4]

    then the first bin is ``[1, 2)`` (including 1, but excluding 2) and
    the second ``[2, 3)``.  The last bin, however, is ``[3, 4]``, which
    *includes* 4.

    Unequally spaced bins are supported if *bins* is a sequence.

    With Numpy 1.11 or newer, you can alternatively provide a string
    describing a binning strategy, such as 'auto', 'sturges', 'fd',
    'doane', 'scott', 'rice' or 'sqrt', see
    `numpy.histogram`.

    The default is taken from :rc:`hist.bins`.

range : tuple or None, optional
    The lower and upper range of the bins. Lower and upper outliers
    are ignored. If not provided, *range* is ``(x.min(), x.max())``.
    Range has no effect if *bins* is a sequence.

    If *bins* is a sequence or *range* is specified, autoscaling
    is based on the specified bin range instead of the
    range of x.

    Default is ``None``

density : bool, optional
    If ``True``, the first element of the return tuple will
    be the counts normalized to form a probability density, i.e.,
    the area (or integral) under the histogram will sum to 1.
    This is achieved by dividing the count by the number of
    observations times the bin width and not dividing by the total
    number of observations. If *stacked* is also ``True``, the sum of
    the histograms is normalized to 1.

    Default is ``None`` for both *normed* and *density*. If either is
    set, then that value will be used. If neither are set, then the
    args will be treated as ``False``.

    If both *density* and *normed* are set an error is raised.

weights : (n, ) array_like or None, optional
    An array of weights, of the same shape as *x*.  Each value in *x*
    only contributes its associated weight towards the bin count
    (instead of 1).  If *normed* or *density* is ``True``,
    the weights are normalized, so that the integral of the density
    over the range remains 1.

    Default is ``None``.

    This parameter can be used to draw a histogram of data that has
    already been binned, e.g. using `np.histogram` (by treating each
    bin as a single point with a weight equal to its count) ::

        counts, bins = np.histogram(data)
        plt.hist(bins[:-1], bins, weights=counts)

    (or you may alternatively use `~.bar()`).

cumulative : bool, optional
    If ``True``, then a histogram is computed where each bin gives the
    counts in that bin plus all bins for smaller values. The last bin
    gives the total number of datapoints. If *normed* or *density*
    is also ``True`` then the histogram is normalized such that the
    last bin equals 1. If *cumulative* evaluates to less than 0
    (e.g., -1), the direction of accumulation is reversed.
    In this case, if *normed* and/or *density* is also ``True``, then
    the histogram is normalized such that the first bin equals 1.

    Default is ``False``

bottom : array_like, scalar, or None
    Location of the bottom baseline of each bin.  If a scalar,
    the base line for each bin is shifted by the same amount.
    If an array, each bin is shifted independently and the length
    of bottom must match the number of bins.  If None, defaults to 0.

    Default is ``None``

histtype : {'bar', 'barstacked', 'step',  'stepfilled'}, optional
    The type of histogram to draw.

    - 'bar' is a traditional bar-type histogram.  If multiple data
      are given the bars are arranged side by side.

    - 'barstacked' is a bar-type histogram where multiple
      data are stacked on top of each other.

    - 'step' generates a lineplot that is by default
      unfilled.

    - 'stepfilled' generates a lineplot that is by default
      filled.

    Default is 'bar'

align : {'left', 'mid', 'right'}, optional
    Controls how the histogram is plotted.

        - 'left': bars are centered on the left bin edges.

        - 'mid': bars are centered between the bin edges.

        - 'right': bars are centered on the right bin edges.

    Default is 'mid'

orientation : {'horizontal', 'vertical'}, optional
    If 'horizontal', `~matplotlib.pyplot.barh` will be used for
    bar-type histograms and the *bottom* kwarg will be the left edges.

rwidth : scalar or None, optional
    The relative width of the bars as a fraction of the bin width.  If
    ``None``, automatically compute the width.

    Ignored if *histtype* is 'step' or 'stepfilled'.

    Default is ``None``

log : bool, optional
    If ``True``, the histogram axis will be set to a log scale. If
    *log* is ``True`` and *x* is a 1D array, empty bins will be
    filtered out and only the non-empty ``(n, bins, patches)``
    will be returned.

    Default is ``False``

color : color or array_like of colors or None, optional
    Color spec or sequence of color specs, one per dataset.  Default
    (``None``) uses the standard line color sequence.

    Default is ``None``

label : str or None, optional
    String, or sequence of strings to match multiple datasets.  Bar
    charts yield multiple patches per dataset, but only the first gets
    the label, so that the legend command will work as expected.

    default is ``None``

stacked : bool, optional
    If ``True``, multiple data are stacked on top of each other If
    ``False`` multiple data are arranged side by side if histtype is
    'bar' or on top of each other if histtype is 'step'

    Default is ``False``

normed : bool, optional
    Deprecated; use the density keyword argument instead.

Returns
-------
n : array or list of arrays
    The values of the histogram bins. See *density* and *weights* for a
    description of the possible semantics.  If input *x* is an array,
    then this is an array of length *nbins*. If input is a sequence of
    arrays ``[data1, data2,..]``, then this is a list of arrays with
    the values of the histograms for each of the arrays in the same
    order.  The dtype of the array *n* (or of its element arrays) will
    always be float even if no weighting or normalization is used.

bins : array
    The edges of the bins. Length nbins + 1 (nbins left edges and right
    edge of last bin).  Always a single array even when multiple data
    sets are passed in.

patches : list or list of lists
    Silent list of individual patches used to create the histogram
    or list of such list if multiple input datasets.

Other Parameters
----------------
**kwargs : `~matplotlib.patches.Patch` properties

See also
--------
hist2d : 2D histograms

Notes
-----

.. note::
    In addition to the above described arguments, this function can take a
    **data** keyword argument. If such a **data** argument is given, the
    following arguments are replaced by **data[<arg>]**:

    * All arguments with the following names: 'weights', 'x'.

    Objects passed as **data** must support item access (``data[<arg>]``) and
    membership test (``<arg> in data``).

到此这篇关于matplotlib中plt.hist()参数解释及应用实例的文章就介绍到这了,更多相关matplotlib plt.hist()参数内容请搜索我们以前的文章或继续浏览下面的相关文章希望大家以后多多支持我们！

matplotlib 画动态图以及plt.ion()和plt.ioff()的使用详解

学习python的道路是漫长的,今天又遇到一个问题,所以想写下来自己的理解方便以后查看. 在使用matplotlib的过程中,常常会需要画很多图,但是好像并不能同时展示许多图.这是因为python可视化库matplotlib的显示模式默认为阻塞(block)模式.什么是阻塞模式那?我的理解就是在plt.show()之后,程序会暂停到那儿,并不会继续执行下去.如果需要继续执行程序,就要关闭图片.那如何展示动态图或多个窗口呢?这就要使用plt.ion()这个函数,使matplotlib的显示模式转换
matplotlib常见函数之plt.rcParams、matshow的使用(坐标轴设置)

1.plt.rcParams plt(matplotlib.pyplot)使用rc配置文件来自定义图形的各种默认属性,称之为"rc配置"或"rc参数". 通过rc参数可以修改默认的属性,包括窗体大小.每英寸的点数.线条宽度.颜色.样式.坐标轴.坐标和网络属性.文本.字体等.rc参数存储在字典变量中,通过字典的方式进行访问. 代码: import numpy as np import matplotlib.pyplot as plt ###%matplotlib in
Python matplotlib通过plt.scatter画空心圆标记出特定的点方法

在用python画散点图的时候想标记出特定的点,比如在某些点的外围加个空心圆,一样可以通过plt.scatter实现 import matplotlib.pyplot as plt x = [[1, 3], [2, 5]] y = [[4, 7], [6, 3]] for i in range(len(x)): plt.plot(x[i], y[i], color='r') plt.scatter(x[i], y[i], color='b') plt.scatter(x[i], y[i], co
python matplotlib:plt.scatter() 大小和颜色参数详解

语法 plt.scatter(x, y, s=20, c='b') 大小s默认为20,s=0时点不显示:颜色c默认为蓝色. 为每一个点指定大小和颜色有时我们需要为每一个点指定大小和方向,以区分不同的点.这时,可以向s和c传入列表.如: import matplotlib.pyplot as plt import numpy as np x = list(range(1, 7)) plt.scatter(x, x, s=10*np.array(x)**2, c=x) plt.show() 参数s
Python Matplotlib通过plt.subplots创建子绘图

目录前言一.只有子图的绘制二.单个方向堆叠子图三.行列方向扩展子图四.共享轴五.极坐标子图前言 plt.subplots调用后将会产生一个图表(Figure)和默认网格(Grid),与此同时提供一个合理的控制策略布局子绘图. 一.只有子图的绘制如果没有提供参数给subplots将会返回: Figure一个Axes对象例子: fig, ax = plt.subplots() ax.plot(x, y) ax.set_title('A single plot') 二.单个方向堆叠子
matplotlib 曲线图和折线图 plt.plot()实例

我就废话不多说了,大家还是直接看代码吧! 绘制曲线: import time import numpy as np import matplotlib.pyplot as plt x = np.linspace(0, 10, 1000) y = np.sin(x) plt.figure(figsize=(6,4)) plt.plot(x,y,color="red",linewidth=1 ) plt.xlabel("x") #xlabel.ylabel:分别设置X.
matplotlib 使用 plt.savefig() 输出图片去除旁边的空白区域

最近在作图时需要将输出的图片紧密排布,还要去掉坐标轴,同时设置输出图片大小. 要让程序自动将图表保存到文件中,代码为: plt.savefig('squares_plot.png', bbox_inches='tight') 第一个实参指定要以什么样的文件名保存图表,这个文件将存储到scatter_squares.py所在的目录中. 第二个实参指定将图表多余的空白区域裁减掉.如果要保留图表周围多余的空白区域,可省略这个实参. 但是发现matplotlib使用plt.savefig()保存的图片
matplotlib中plt.hist()参数解释及应用实例

目录一.plt.hist()参数详解二.plt.hist()简单应用三.plt.bar()综合应用附官方参数解释一.plt.hist()参数详解简介:plt.hist():直方图,一种特殊的柱状图.将统计值的范围分段,即将整个值的范围分成一系列间隔,然后计算每个间隔中有多少值.直方图也可以被归一化以显示“相对”频率. 然后,它显示了属于几个类别中的每个类别的占比,其高度总和等于1. import matplotlib as mpl import matplotlib.pyplot a
关于python中plt.hist参数的使用详解

如下所示: matplotlib.pyplot.hist( x, bins=10, range=None, normed=False, weights=None, cumulative=False, bottom=None, histtype=u'bar', align=u'mid', orientation=u'vertical', rwidth=None, log=False, color=None, label=None, stacked=False, hold=None, **kwarg
MySQL5.7中 performance和sys schema中的监控参数解释(推荐)

1.performance schema:介绍在MySQL5.7中,performance schema有很大改进,包括引入大量新加入的监控项.降低占用空间和负载,以及通过新的sys schema机制显著提升易用性.在监控方面,performance schema有如下功能: ①:元数据锁: 对于了解会话之间元数据锁的依赖关系至关重要.从MySQL5.7.3开始,就可以通过metadata_locks表来了解元数据锁的相关信息: --哪些会话拥有哪些元数据锁 --哪些会话正在等待元数据锁
Python中zip()函数的解释和可视化(实例详解)

zip()的作用先看一下语法: zip(iter1 [,iter2 [...]]) -> zip object Python的内置help()模块提供了一个简短但又有些令人困惑的解释: 返回一个元组迭代器,其中第i个元组包含每个参数序列或可迭代对象中的第i个元素.当最短的可迭代输入耗尽时,迭代器将停止.使用单个可迭代参数,它将返回1元组的迭代器.没有参数,它将返回一个空的迭代器. 与往常一样,当您精通更一般的计算机科学和Python概念时,此模块非常有用.但是,对于初学者来说,这段话只会引发更
在matplotlib中改变figure的布局和大小实例

以下来自Stack Overflow 从上面我们可以很清晰的看出应该如何使用matplotlib的figure方法. 补充知识:matplotlib 设置图形大小时 figsize 与 dpi 的关系 matplotlib 中设置图形大小的语句如下: fig = plt.figure(figsize=(a, b), dpi=dpi) 其中: figsize 设置图形的大小,a 为图形的宽, b 为图形的高,单位为英寸 dpi 为设置图形每英寸的点数则此时图形的像素为: px, py = a*d
PowerShell函数中的开关参数介绍和创建实例

本文介绍什么是开关参数,在PowerShell自定义函数中,如何创建开关参数并使用开关参数的值. 什么叫开关参数呢?举个例子,技术男一般都知道有一个网络命令叫"Ping",我们可以使用"ping www.jb51.net"这样一个命令来检查本地计算机到www.jb51.net这个网站所在的服务器网络是否连通.这个命令会从本地发送4个数据包到www.jb51.net服务器,并显示每个数据包是否收到了反馈结果.如果我正在重启www.jb51.net这台服务器,那么pin
Shell脚本中的位置变量参数（特殊字符）实例讲解

$# : 传递到脚本的参数个数 $* : 以一个单字符串显示所有向脚本传递的参数.与位置变量不同,此选项参数可超过 9个 $$ : 脚本运行的当前进程 ID号 $! : 后台运行的最后一个进程的进程 ID号 $@ : 与$#相同,但是使用时加引号,并在引号中返回每个参数 $- : 显示shell使用的当前选项,与 set命令功能相同 $? : 显示最后命令的退出状态. 0表示没有错误,其他任何值表明有错误. 复制代码代码如下: #!/bin/sh #param.sh # $0:文件完整路径名
关于matplotlib及相关cmap参数的取值方式

目录 matplotlib及相关cmap参数的取值 matplotlib中各种图形参数解释柱状图bar的使用散点图scatter的使用折线图plot的使用箱型图boxplot的使用饼图pie的使用 matplotlib及相关cmap参数的取值在matplotlib中对于图片的显示有如下方法(这不是重点), 其中有cmap=‘binary’的参数. plt.imshow(imgs[i].reshape(28, 28), cmap='binary') #或如下:也可以达到相同的效果 pl
Matplotlib中rcParams使用方法

主要作用为指定图片像素: matplotlib.rcParams['figure.figsize']#图片像素 matplotlib.rcParams['savefig.dpi']#分辨率 plt.savefig('plot123_2.png', dpi=200)#指定分辨率 %matplotlib inline import matplotlib # 注意这个也要import一次 import matplotlib.pyplot as plt from IPython.core.pylabto
Pytorch中关于BatchNorm2d的参数解释

目录 BatchNorm2d中的track_running_stats参数 running_mean和running_var参数 BatchNorm2d参数讲解总结 BatchNorm2d中的track_running_stats参数如果BatchNorm2d的参数val,track_running_stats设置False,那么加载预训练后每次模型测试测试集的结果时都不一样: track_running_stats设置为True时,每次得到的结果都一样. running_mean和runn

matplotlib中plt.hist()参数解释及应用实例

目录

一、plt.hist()参数详解

二、plt.hist()简单应用

三、plt.bar()综合应用

附官方参数解释

相关推荐

随机推荐