A Concise Guide to Using PyHive

I use PyHive. As far as I can tell, its author is still actively maintaining it on GitHub. Link: pyhive

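It is a tool, so we can simply use it as-is. Your server does need to have the corresponding service enabled, though, otherwise the client will have nothing to connect to later on.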

On macOS, installing the following packages is enough:

pip install pyhive
pip install thrift
pip install sasl
pip install thrift-sasl
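
A quick way to confirm that the four packages above installed correctly is to check that each one can be found for import. The sketch below is not part of PyHive; `missing_modules` is a hypothetical helper, and the import names are assumed to match the pip names (with `thrift-sasl` imported as `thrift_sasl`).

```python
from importlib.util import find_spec

# Import names of the four packages installed above; the thrift-sasl
# distribution is assumed to be imported as thrift_sasl.
REQUIRED_MODULES = ("pyhive", "thrift", "sasl", "thrift_sasl")


def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All PyHive dependencies are importable.")
```

If anything is reported missing, re-running the matching `pip install` line is the first thing to try.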

Here is a small, simple example:

# -*- coding: utf-8 -*-
# @Time    : 2018/6/19 11:32 AM
# @Author  : zhusimaji
# @File    : python_hive.py
# @Software: PyCharm

from pyhive import hive

PORT = 10000
name = "*****"
password = "*****"
conn = hive.Connection(
    host="your host",
    port=PORT,
    username=name,
    database='stg',
    auth='LDAP',
    password=password,
)

cursor = conn.cursor()
cursor.execute("SELECT col1 FROM table123 LIMIT 10")
for result in cursor.fetchall():
    print(result)
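
When a query depends on a runtime value, it is usually safer to pass that value as a parameter than to format it into the SQL string by hand. The sketch below assumes that PyHive's cursor follows the DB-API `pyformat` style (`%(name)s` placeholders with a dict of parameters). `select_col1` and its WHERE clause are hypothetical additions for illustration; the table and column names are the placeholders from the example above.

```python
def select_col1(cursor, value, limit=10):
    """Run a filtered variant of the example query and return all rows.

    The WHERE clause is a hypothetical addition; table123 and col1 are the
    placeholder names used in the example above.
    """
    sql = "SELECT col1 FROM table123 WHERE col1 = %(value)s LIMIT %(limit)s"
    # The parameters are passed separately so that the driver escapes them,
    # instead of being formatted into the SQL string by hand.
    cursor.execute(sql, {"value": value, "limit": limit})
    return cursor.fetchall()
```

With the connection from the example, this would be called as `rows = select_col1(conn.cursor(), "some value")`.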

The default port is normally 10000. Next, let's look at the initialization parameters of the Connection class:

def __init__(self, host=None, port=None, username=None, database='default', auth=None,
             configuration=None, kerberos_service_name=None, password=None,
             thrift_transport=None):
    """
    :param host: Hostname of the Hive server.
    :param port: Port of the Hive service. Defaults to 10000.
    :param auth: The value of hive.server2.authentication used by HiveServer2
        (the authentication mode). Defaults to ``NONE``.
    :param configuration: A dictionary of Hive settings (functionally same as the `set` command)
    :param kerberos_service_name: Use with auth='KERBEROS' only
    :param password: Use with auth='LDAP' or auth='CUSTOM' only. If auth is LDAP or CUSTOM,
        a password must be supplied.
    :param thrift_transport: A ``TTransportBase`` for custom advanced usage.
        Incompatible with host, port, auth, kerberos_service_name, and password.
    """
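
The rules in this docstring can be checked before a connection is ever attempted. The following is a minimal sketch, not PyHive code: `build_connection_kwargs` is a hypothetical helper that only encodes what the parameter descriptions state (a password goes together with LDAP or CUSTOM, and `kerberos_service_name` only with KERBEROS) and returns keyword arguments for `hive.Connection`.

```python
def build_connection_kwargs(host, port=10000, username=None, database="default",
                            auth=None, password=None, kerberos_service_name=None,
                            configuration=None):
    """Collect keyword arguments for hive.Connection, checking the documented rules."""
    # A password is supplied if and only if auth is LDAP or CUSTOM.
    if (password is not None) != (auth in ("LDAP", "CUSTOM")):
        raise ValueError("password must be set exactly when auth is 'LDAP' or 'CUSTOM'")
    if kerberos_service_name is not None and auth != "KERBEROS":
        raise ValueError("kerberos_service_name requires auth='KERBEROS'")

    kwargs = {"host": host, "port": port, "username": username, "database": database}
    # Leave unset optional arguments out entirely so the library defaults apply.
    optional = {
        "auth": auth,
        "password": password,
        "kerberos_service_name": kerberos_service_name,
        "configuration": configuration,
    }
    kwargs.update({key: value for key, value in optional.items() if value is not None})
    return kwargs
```

The result would then be unpacked into the constructor, for example `hive.Connection(**build_connection_kwargs("your host", username=name, database="stg", auth="LDAP", password=password))`.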

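That is why the test code above, which uses LDAP authentication, has to supply the matching username and password.

Connection has a few commonly used methods, briefly explained here: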

    # close: as the name suggests, closes the connection
    def close(self):
        """Close the underlying session and Thrift transport"""
        req = ttypes.TCloseSessionReq(sessionHandle=self._sessionHandle)
        response = self._client.CloseSession(req)
        self._transport.close()
        _check_status(response)
    # commit: only meaningful for databases such as MySQL that support transactions; Hive does not, so this is a no-op
    def commit(self):
        """Hive does not support transactions, so this does nothing."""
        pass
    # cursor: returns a cursor, which is then used to submit SQL queries
    def cursor(self, *args, **kwargs):
        """Return a new :py:class:`Cursor` object using the connection."""
        return Cursor(self, *args, **kwargs)

    @property
    def client(self):
        return self._client

    @property
    def sessionHandle(self):
        return self._sessionHandle
    # rollback: MySQL supports transactions and can therefore roll back; Hive cannot
    def rollback(self):
        raise NotSupportedError("Hive does not have transactions")  # pragma: no cover
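
Since `commit()` does nothing and `rollback()` raises, the only cleanup a Hive session really needs is closing the cursor and the connection. One way to guarantee that is a small context manager. This is a sketch under the assumption that the cursor exposes `close()` as the DB-API prescribes; `open_cursor` is a hypothetical helper, not part of PyHive.

```python
from contextlib import contextmanager


@contextmanager
def open_cursor(conn):
    """Yield a cursor and always close the cursor and the connection afterwards."""
    cursor = conn.cursor()
    try:
        yield cursor
    finally:
        # No commit() or rollback() here: per the methods above, commit() is a
        # no-op on Hive and rollback() raises NotSupportedError.
        try:
            cursor.close()
        finally:
            conn.close()
```

Used as `with open_cursor(conn) as cursor: cursor.execute(...)`, the connection is closed even if the query raises.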

Now let's look at Cursor:

class Cursor(common.DBAPICursor):
    """These objects represent a database cursor, which is used to manage the context of a fetch
    operation.

    Cursors are not isolated, i.e., any changes done to the database by a cursor are immediately
    visible by other cursors or connections.
    """

    def __init__(self, connection, arraysize=1000):
        self._operationHandle = None
        super(Cursor, self).__init__()
        self._arraysize = arraysize
        self._connection = connection

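This class inherits from DBAPICursor, which already defines many methods. The cursor.fetchall() call in the sample code earlier is one of them: fetchall is defined in the parent class. That is about all there is to describe.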
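
For large result sets, `fetchall()` loads every row into memory at once. The DB-API also specifies `fetchmany()`, which I assume the parent class provides alongside `fetchall()`. A minimal sketch of reading rows in batches follows; `iter_batches` is a hypothetical helper, and its default batch size merely mirrors the `arraysize=1000` default seen in `Cursor.__init__`.

```python
def iter_batches(cursor, batch_size=1000):
    """Yield lists of rows from an executed cursor, at most batch_size rows each."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            # An empty result means the cursor is exhausted.
            break
        yield rows
```

After `cursor.execute(...)`, looping with `for batch in iter_batches(cursor, 500):` processes the rows 500 at a time.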

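Copyright notice: original article of this site, published by admin on 2018-06-19, 3772 characters in total.
Reposting: unless otherwise stated, articles on this site are published under the CC-4.0 license; please credit the source when reposting.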