An Alternative to fminunc in Python

2025-06-27 10:24:25

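I've had some free time lately, so I decided to reimplement the exercises from Stanford's ML course on Coursera in Python, partly to reinforce my ML fundamentals and partly to get familiar with libraries such as numpy, matplotlib, and scipy.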

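In EX2, theta is optimized with MATLAB's fminunc function, and I didn't know how to do the same thing in Python. After some searching, I found a suggestion on Stack Overflow to use the minimize function from scipy instead. When I called it directly with my own costfunction and grad, the program failed with errors along the lines of a dimension mismatch between (3,) and (100,1) and a bad gradient vector. After many attempts I finally tracked down the problem.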

First, let's look at the documentation printed by np.info(minimize). The parameters it accepts include:

fun : callable
 The objective function to be minimized.

  ``fun(x, *args) -> float``

 where x is an 1-D array with shape (n,) and `args`
 is a tuple of the fixed parameters needed to completely
 specify the function.
x0 : ndarray, shape (n,)
 Initial guess. Array of real elements of size (n,),
 where 'n' is the number of independent variables.
args : tuple, optional
 Extra arguments passed to the objective function and its
 derivatives (`fun`, `jac` and `hess` functions).
method : str or callable, optional
 Type of solver. Should be one of

  - 'Nelder-Mead' :ref:`(see here) <optimize.minimize-neldermead>`
  - 'Powell'  :ref:`(see here) <optimize.minimize-powell>`
  - 'CG'   :ref:`(see here) <optimize.minimize-cg>`
  - 'BFGS'  :ref:`(see here) <optimize.minimize-bfgs>`
  - 'Newton-CG' :ref:`(see here) <optimize.minimize-newtoncg>`
  - 'L-BFGS-B' :ref:`(see here) <optimize.minimize-lbfgsb>`
  - 'TNC'   :ref:`(see here) <optimize.minimize-tnc>`
  - 'COBYLA'  :ref:`(see here) <optimize.minimize-cobyla>`
  - 'SLSQP'  :ref:`(see here) <optimize.minimize-slsqp>`
  - 'trust-constr':ref:`(see here) <optimize.minimize-trustconstr>`
  - 'dogleg'  :ref:`(see here) <optimize.minimize-dogleg>`
  - 'trust-ncg' :ref:`(see here) <optimize.minimize-trustncg>`
  - 'trust-exact' :ref:`(see here) <optimize.minimize-trustexact>`
  - 'trust-krylov' :ref:`(see here) <optimize.minimize-trustkrylov>`
  - custom - a callable object (added in version 0.14.0),
   see below for description.

 If not given, chosen to be one of ``BFGS``, ``L-BFGS-B``, ``SLSQP``,
 depending if the problem has constraints or bounds.
jac : {callable, '2-point', '3-point', 'cs', bool}, optional
 Method for computing the gradient vector. Only for CG, BFGS,
 Newton-CG, L-BFGS-B, TNC, SLSQP, dogleg, trust-ncg, trust-krylov,
 trust-exact and trust-constr. If it is a callable, it should be a
 function that returns the gradient vector:

  ``jac(x, *args) -> array_like, shape (n,)``

 where x is an array with shape (n,) and `args` is a tuple with
 the fixed parameters. Alternatively, the keywords
 {'2-point', '3-point', 'cs'} select a finite
 difference scheme for numerical estimation of the gradient. Options
 '3-point' and 'cs' are available only to 'trust-constr'.
 If `jac` is a Boolean and is True, `fun` is assumed to return the
 gradient along with the objective function. If False, the gradient
 will be estimated using '2-point' finite difference estimation.
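
The last sentence of the jac description says that jac=True makes minimize expect fun to return the gradient together with the cost. A minimal sketch of that mode on a toy quadratic follows; cost_and_grad and target are illustrative names, not part of the exercise:

```python
import numpy as np
from scipy.optimize import minimize

def cost_and_grad(x, target):
    """Return (cost, gradient) of sum((x - target)**2) in a single call."""
    diff = x - target
    return np.sum(diff ** 2), 2.0 * diff

target = np.array([1.0, -2.0, 3.0])

# jac=True tells minimize that fun returns the tuple (cost, gradient).
res = minimize(cost_and_grad, x0=np.zeros(3), args=(target,),
               jac=True, method="BFGS")
print(res.x)
```

This can save work when the cost and the gradient share intermediate results, as they do in logistic regression, where both need sigmoid(X.dot(theta)).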

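Note that the function passed as fun must take the theta being optimized as its first argument, with X and y after it. Also, theta must be passed in as a 1-D array of shape (n,); otherwise you get errors.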
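
If an existing cost function takes its arguments in a different order, one option is a thin wrapper that puts theta first. The sketch below uses a small, made-up least-squares problem; cost_data_first and cost_theta_first are hypothetical names used only for illustration:

```python
import numpy as np
from scipy.optimize import minimize

def cost_data_first(X, y, theta):
    """Hypothetical least-squares cost that takes the parameters last."""
    residual = X.dot(theta) - y
    return residual.dot(residual) / (2 * len(y))

def cost_theta_first(theta, X, y):
    """Wrapper with the order minimize expects: parameters first, data after."""
    return cost_data_first(X, y, theta)

# Tiny made-up data set that lies exactly on y = 1 + 2*x.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# x0 is 1-D with shape (n,); X and y are forwarded through args.
# No jac is given, so the gradient is estimated by finite differences.
res = minimize(cost_theta_first, x0=np.zeros(2), args=(X, y))
print(res.x)
```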

jac is the gradient. Two things to watch here: first, the theta passed in is again a 1-D array of shape (n,); second, the returned gradient must also be a 1-D array of shape (n,).

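In short, the key is that theta must be 1-D with shape (n,), otherwise it won't work. For convenience I had previously shaped theta as an (n,1) column vector, which is what made minimize fail. So learning to read the documentation with help really is important.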
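
The shape problem is easy to reproduce in isolation. The sketch below uses dummy arrays (the sizes m and n are arbitrary) to show how mixing a 1-D theta with an (m,1) column vector of labels silently broadcasts to an (m,m) matrix, and two ways to keep the shapes consistent:

```python
import numpy as np

m, n = 5, 3
X = np.ones((m, n))
theta = np.zeros(n)           # 1-D, shape (n,): the shape minimize works with
y_col = np.zeros((m, 1))      # labels stored as a column vector, shape (m, 1)

h = X.dot(theta)              # shape (m,)
bad = h - y_col               # (m,) minus (m, 1) broadcasts to (m, m)

good_1d = h - y_col.ravel()                        # everything 1-D: shape (m,)
good_col = X.dot(theta.reshape(n, 1)) - y_col      # everything as columns: (m, 1)

print(h.shape, bad.shape, good_1d.shape, good_col.shape)
```

Either convention works as long as it is applied consistently inside the function and the gradient is flattened back to shape (n,) before it is returned.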

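import numpy as np
import pandas as pd
import scipy.optimize as op

def LoadData(filename):
    # Read the headerless CSV file into a NumPy array.
    data=pd.read_csv(filename,header=None)
    data=np.array(data)
    return data

def ReshapeData(data):
    # First two columns are the features, third column is the label.
    m=np.size(data,0)
    X=data[:,0:2]
    Y=data[:,2]
    Y=Y.reshape((m,1))  # Y is kept as an (m,1) column vector
    return X,Y

def InitData(X):
    m,n=X.shape
    # theta must be 1-D with shape (n+1,), as minimize requires.
    initial_theta = np.zeros(n + 1)
    # Prepend a column of ones for the intercept term.
    VecOnes = np.ones((m, 1))
    X = np.column_stack((VecOnes, X))
    return X,initial_theta

def sigmoid(x):
    z=1/(1+np.exp(-x))
    return z

def costFunction(theta,X,Y):
    # theta has shape (n,), so X.dot(theta) has shape (m,).
    # Y.T has shape (1,m), so J comes out as an array of shape (1,).
    m=X.shape[0]
    J = (-np.dot(Y.T, np.log(sigmoid(X.dot(theta)))) - \
         np.dot((1 - Y).T, np.log(1 - sigmoid(X.dot(theta))))) / m
    return J

def gradient(theta,X,Y):
    m,n=X.shape
    # Reshape theta to (n,1) so the subtraction with the (m,1) Y lines up.
    theta=theta.reshape((n,1))
    grad=np.dot(X.T,sigmoid(X.dot(theta))-Y)/m
    # Return the gradient as a 1-D array of shape (n,).
    return grad.flatten()

if __name__=='__main__':
    data = LoadData('ex2data1csv.csv')
    X, Y = ReshapeData(data)
    X, initial_theta = InitData(X)
    result = op.minimize(fun=costFunction, x0=initial_theta, args=(X, Y), method='TNC', jac=gradient)
    print(result)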
The final result is shown below. It matches what fminunc gives in MATLAB (fminunc: cost 0.203, theta: -25.161, 0.206, 0.201).

     fun: array([0.2034977])
     jac: array([8.95038682e-09, 8.16149951e-08, 4.74505693e-07])
 message: 'Local minimum reached (|pg| ~= 0)'
    nfev: 36
     nit: 17
  status: 0
 success: True
       x: array([-25.16131858, 0.20623159, 0.20147149])

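In addition, since I knew the cost should end up around 0.203, I also tried plain gradient descent. It became so slow toward the end that I set the loop condition to while J>0.21, and it still took roughly 130,000 iterations. That shows how much it pays to use a ready-made optimization algorithm. Also, I used to think that an unsuitable learning rate would make J diverge indefinitely, but yesterday's experiment showed that with some rates J diverges at first and then still converges.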
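
The comparison between gradient descent and a library optimizer can be sketched roughly as follows. This does not use the exercise file: the data is synthetic, generated from an assumed theta of (-25, 0.2, 0.2), and the learning rate, the step count, and the choice of BFGS are all illustrative assumptions. The cost is written with np.logaddexp, which is algebraically the same logistic cost but avoids log(0).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable sigmoid

# Synthetic stand-in for the exercise data (an assumption, not ex2data1).
rng = np.random.default_rng(0)
m = 100
scores = rng.uniform(30, 100, size=(m, 2))
X = np.column_stack((np.ones(m), scores))
true_theta = np.array([-25.0, 0.2, 0.2])
y = (rng.random(m) < expit(X.dot(true_theta))).astype(float)

def cost(theta, X, y):
    z = X.dot(theta)
    # Same value as mean(-y*log(h) - (1-y)*log(1-h)) with h = sigmoid(z).
    return np.mean(np.logaddexp(0.0, z) - y * z)

def gradient(theta, X, y):
    return X.T.dot(expit(X.dot(theta)) - y) / len(y)

# Plain gradient descent with a small fixed learning rate.
theta = np.zeros(3)
J_init = cost(theta, X, y)
alpha = 1e-4
for _ in range(5000):
    theta = theta - alpha * gradient(theta, X, y)
J_gd = cost(theta, X, y)

# Library optimizer, started from the same point.
res = minimize(cost, np.zeros(3), args=(X, y), jac=gradient, method="BFGS")

print("initial cost:", J_init)
print("gradient descent after 5000 steps:", J_gd)
print("BFGS:", res.fun, "after", res.nit, "iterations")
```

Because the features are unscaled and range from 30 to 100, a fixed learning rate has to stay small to remain stable, which is one plausible reason plain gradient descent needs so many iterations on this kind of data.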
