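I use PyHive, whose author is still actively maintaining it on GitHub; here is the link: pyhive
Since it is just a tool, we can use it as is. Your server does need to have the corresponding service enabled, though, so that we can connect to it from a client later.
On macOS, installing the following packages is enough: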
pip install pyhive
pip install thrift
pip install sasl
pip install thrift-sasl
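A quick way to confirm the installation is to check that each module can be located. The sketch below assumes the import names are `pyhive`, `thrift`, `sasl`, and `thrift_sasl` (the last being the import name of the `thrift-sasl` package); `missing_modules` is a hypothetical helper name introduced here, not part of PyHive.

```python
import importlib.util

# Import names assumed for the four pip packages listed above.
REQUIRED_MODULES = ["pyhive", "thrift", "sasl", "thrift_sasl"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All PyHive dependencies can be located.")
```

This only checks that the modules can be found; it does not verify that the server side is reachable.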
Here is a simple example:
# -*- coding: utf-8 -*-
# @Time : 2018/6/19 11:32 AM
# @Author : zhusimaji
# @File : python_hive.py
# @Software: PyCharm
from pyhive import hive

PORT = 10000
name = "*****"
password = "*****"
# auth='LDAP' means a username and password must be supplied
conn = hive.Connection(host="your host", port=PORT, username=name,
                       database='stg', auth='LDAP', password=password)
cursor = conn.cursor()
cursor.execute("SELECT col1 FROM table123 LIMIT 10")
for result in cursor.fetchall():
    print(result)
In most cases the default port is 10000. Next, let's look at the initialization parameters of the Connection class.
def __init__(self, host=None, port=None, username=None, database='default', auth=None,
             configuration=None, kerberos_service_name=None, password=None,
             thrift_transport=None):
    :param host: the host to connect to
    :param port: the Hive service port. Defaults to 10000.
    :param auth: The value of hive.server2.authentication used by HiveServer2 (the authentication mode).
        Defaults to ``NONE``.
    :param configuration: A dictionary of Hive settings (functionally same as the `set` command)
    :param kerberos_service_name: Use with auth='KERBEROS' only
    :param password: Use with auth='LDAP' or auth='CUSTOM' only (a password is required when auth is LDAP or CUSTOM)
    :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
        Incompatible with host, port, auth, kerberos_service_name, and password.
That is why the test code above uses LDAP authentication and has to supply the matching username and password.
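The password rule from the docstring can be checked before any network call is made. Below is a minimal sketch, assuming the rule is exactly as documented above (a password goes with `auth='LDAP'` or `auth='CUSTOM'` and with nothing else). `build_hive_kwargs` is a hypothetical helper, not part of PyHive; only the keyword names come from the signature shown above.

```python
def build_hive_kwargs(host, username, auth=None, password=None,
                      port=10000, database="default"):
    """Assemble keyword arguments for hive.Connection (hypothetical helper).

    Enforces the documented rule: a password is supplied together with
    auth='LDAP' or auth='CUSTOM', and only with those modes.
    """
    needs_password = auth in ("LDAP", "CUSTOM")
    if needs_password and password is None:
        raise ValueError("auth=%r requires a password" % auth)
    if not needs_password and password is not None:
        raise ValueError("password is only used with auth='LDAP' or 'CUSTOM'")

    kwargs = {"host": host, "port": port, "username": username,
              "database": database}
    # Leave out optional arguments that were not given, so the
    # library's own defaults apply.
    if auth is not None:
        kwargs["auth"] = auth
    if password is not None:
        kwargs["password"] = password
    return kwargs
```

The resulting dictionary can then be unpacked into the constructor, for example `hive.Connection(**build_hive_kwargs("your host", name, auth="LDAP", password=password))`.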
Connection has a few commonly used methods; here is a brief rundown.
# close, as the name suggests, closes the connection
def close(self):
    """Close the underlying session and Thrift transport"""
    req = ttypes.TCloseSessionReq(sessionHandle=self._sessionHandle)
    response = self._client.CloseSession(req)
    self._transport.close()
    _check_status(response)

# commit is normally only supported by databases such as MySQL; Hive does not support it
def commit(self):
    """Hive does not support transactions, so this does nothing."""
    pass

# cursor returns a cursor, which is used later to submit SQL queries
def cursor(self, *args, **kwargs):
    """Return a new :py:class:`Cursor` object using the connection."""
    return Cursor(self, *args, **kwargs)

@property
def client(self):
    return self._client

@property
def sessionHandle(self):
    return self._sessionHandle

# MySQL supports transactions, so it can roll back; Hive does not
def rollback(self):
    raise NotSupportedError("Hive does not have transactions")  # pragma: no cover
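Since `commit` does nothing and `rollback` raises, the method that matters in practice is `close`. Here is a minimal sketch of making sure it always runs, using `contextlib.closing` from the standard library. `fetch_all` is a hypothetical helper; it relies only on the DB-API `cursor()`, `execute()`, `fetchall()`, and `close()` methods, so the demo uses an in-memory `sqlite3` database as a stand-in for a Hive connection.

```python
import sqlite3
from contextlib import closing


def fetch_all(conn, sql):
    """Run one query and return all rows, closing the cursor afterwards."""
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        return cursor.fetchall()


if __name__ == "__main__":
    # sqlite3 stands in for hive.Connection here; closing() calls
    # conn.close() even if the query raises.
    with closing(sqlite3.connect(":memory:")) as conn:
        print(fetch_all(conn, "SELECT 1 AS col1"))
```

With PyHive, the same pattern would wrap the object returned by `hive.Connection(...)` instead of the `sqlite3` connection.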
Now let's look at Cursor
class Cursor(common.DBAPICursor):
    """These objects represent a database cursor, which is used to manage the context of a fetch
    operation.

    Cursors are not isolated, i.e., any changes done to the database by a cursor are immediately
    visible by other cursors or connections.
    """

    def __init__(self, connection, arraysize=1000):
        self._operationHandle = None
        super(Cursor, self).__init__()
        self._arraysize = arraysize
        self._connection = connection
This class inherits from DBAPICursor, which already defines many methods. In the sample code earlier we called cursor.fetchall(); fetchall is one of the methods defined in that parent class.
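`fetchall` returns every row in one list, which can be heavy for large result sets. The DB-API also defines `fetchmany`, which returns rows in batches. Below is a sketch assuming the cursor follows the DB-API; it is demonstrated with `sqlite3` as a stand-in because that needs no server. `iter_batches` is a hypothetical helper name.

```python
import sqlite3


def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from an executed cursor, batch_size rows at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty batch means the result set is exhausted.
            break
        yield rows


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    cursor = conn.cursor()
    cursor.execute("SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3")
    for batch in iter_batches(cursor, batch_size=2):
        print(batch)
    conn.close()
```

The default of 1000 mirrors the `arraysize=1000` default in the Cursor constructor shown above, but any batch size works.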
That is about all there is to say.