Azure Open AI Service Document Translation/Generation Service 24-08-07


Nathaniel1 2024. 8. 7. 23:05

Another really good day.. Let's pay attention in class. Today I'll try to write down everything I did so that I can actually use it again later....

Conversational Language Understanding (CLU) is one of the Azure AI services.

The steps for building a CLU model are as follows:

Select data → Label data → Train model → Check model performance → Improve model → Deploy model → Predict entities
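
To make the last step (prediction) a bit more concrete, here is a rough sketch of the request body I would expect to send to a deployed CLU model. The field names follow the conversation analysis REST request as I understand it, so treat them as an assumption and check the official reference. The project name, deployment name, and sample sentence are made up for illustration.

```python
import json


def build_clu_request(text, project_name, deployment_name):
    """Build the JSON body for a CLU prediction request (assumed shape)."""
    return {
        "kind": "Conversation",
        "analysisInput": {
            "conversationItem": {
                "id": "1",
                "participantId": "user",
                "text": text,
            }
        },
        "parameters": {
            # Hypothetical names: use the project and deployment you created.
            "projectName": project_name,
            "deploymentName": deployment_name,
        },
    }


body = build_clu_request(
    "Turn on the living room light", "my-clu-project", "production"
)
print(json.dumps(body, indent=2))
```

The function only builds the payload, so it can be inspected without calling the service or using up any quota.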

Let's pick the CLU work back up!!

Lab environment: Azure Portal → select the resource group I created earlier → head over to Language Studio!!!

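Next I have to select an account resource.

If I pick a resource on which I have Owner permission, the data sheets I have created in the service so far show up at the top.

In the resource group where I have Owner permission, create an Azure AI Services resource!

Once it's created, go to the resource!!

From the resource group → move to AI Studio~

I want to use Language + Translator, so that's the feature I'll use!

Click Document translation to open it!!

There is no hub yet, so choose Select hub → Create new hub!

Click Create hub to create it!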

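In the document translation feature, the samples are split by type: Word, PPT, Excel, and Text. Select one of each and check that each type gets translated from English to Korean.

Clicking the Word button generates a random Word sample .docx file, and clicking Translate turns the English Word document into a Korean one.

Opening the file confirms that the English document was translated into Korean.

The code below translates an English .docx file into a Korean one.

To translate other document types such as txt or ppt, open the Microsoft link below and use the content type that matches the file extension in the document entry; that way each document type can be translated.

In the code below, the content type goes into the "document" entry, for example "application/vnd.openxmlformats-officedocument.wordprocessingml.document". With that value, the document is translated as a Word file.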

https://aka.ms/dtsync-content-type

import os

import requests

# Construct the request URL
endpoint = "Your-endpoint"
path = "/translator/document:translate"
url = endpoint.rstrip("/") + path

headers = {
    "Ocp-Apim-Subscription-Key": "Your-key"
}

# Define the parameters
# Get list of supported languages and codes here: https://aka.ms/TranslatorLanguageCodes
params = {
    "sourceLanguage": "en",
    "targetLanguage": "ko",
    "api-version": "2023-11-01-preview"
}

# Include full path, file name and extension
input_file = "./WordSample.docx"
output_file = "./WordSample_ko.docx"

# Open the input file in binary mode
with open(input_file, "rb") as document:
    # Define the data to be sent
    # Find list of supported content types here: https://aka.ms/dtsync-content-type
    data = {
        "document": (
            os.path.basename(input_file),
            document,
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
        )
    }

    # Send the POST request while the file is still open
    response = requests.post(url, headers=headers, files=data, params=params)

# Stop on an HTTP error so that an error message is not saved as a .docx file
response.raise_for_status()

# Write the translated document to a file
with open(output_file, "wb") as output_document:
    output_document.write(response.content)
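
To reuse the script above for other file types, the content type can be looked up from the file extension. This is just a small sketch: the helper name content_type_for is mine, and the table only holds the standard MIME types for the four sample formats, so the full list of types the service accepts should still be checked at the link above.

```python
import os

# Standard MIME types for the four sample formats from the Studio page.
CONTENT_TYPES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".txt": "text/plain",
}


def content_type_for(file_name):
    """Return the content type for a file name, based on its extension."""
    extension = os.path.splitext(file_name)[1].lower()
    if extension not in CONTENT_TYPES:
        raise ValueError(f"No content type registered for {extension!r}")
    return CONTENT_TYPES[extension]


print(content_type_for("./WordSample.docx"))
print(content_type_for("./Notes.TXT"))
```

The returned string would replace the hard-coded third element of the "document" tuple in the script.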

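Next up, I want to try the Document Intelligence service!!

Azure AI Services has far more features than I expected.... I suppose getting to try all of them is something that would only happen at a company.

Create a Document Intelligence resource, and of course do it in the resource group where I have Owner permission~

Once it's created~ move to the studio~ Go to Document Intelligence Studio!!!!

This time the UI is new. I've never seen this UI before.

Scrolling further down, there are models specialized for document work such as invoices and receipts~

But Read is the one used most, so first I select Read, one of the Document Analysis features.

The analysis UI opens. Clicking 'Run analysis' on the right makes the model analyze the document and pick out its text.

Clicking Result at the top right shows the output organized as JSON.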
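
If the JSON from the Result tab is saved, the recognized lines can be pulled out with a few lines of Python. This is a sketch under an assumption: I'm assuming the result has an "analyzeResult" object whose "pages" each carry a "pageNumber" and a list of "lines" with "content". The sample dictionary is a tiny hand-made stand-in, not real service output.

```python
def lines_by_page(result):
    """Return {page number: [line text, ...]} from an analyze result dict."""
    # Accept either the full response or just the "analyzeResult" part.
    analyze_result = result.get("analyzeResult", result)
    pages = {}
    for page in analyze_result.get("pages", []):
        pages[page.get("pageNumber")] = [
            line.get("content", "") for line in page.get("lines", [])
        ]
    return pages


# Hand-made stand-in for a saved result (assumed field names).
sample = {
    "status": "succeeded",
    "analyzeResult": {
        "pages": [
            {"pageNumber": 1, "lines": [{"content": "Hello"}, {"content": "World"}]},
            {"pageNumber": 2, "lines": []},
        ]
    },
}

print(lines_by_page(sample))
```

For a real file, the dictionary would come from json.load on the saved result instead of the hand-made sample.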

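I also click Code to take a look at how it's put together.

From the top center of the UI, I can open the other sample documents and click Run analysis to see what gets extracted from each.

Splitting a document into its text pieces turns out to be really easy!

Analyzing the next page shows that barcodes are recognized as well.

This time, open Analyze options, tick all of the options, and click Run analysis again~ now the QR code is recognized too.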
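
The detected codes should also show up in the result JSON. The sketch below assumes that each page gets a "barcodes" list whose entries have a "kind" (for example "QRCode") and a "value". I haven't verified those field names against the API reference, so treat them as an assumption; the sample data is made up.

```python
def barcodes_in(result):
    """Return (page number, kind, value) for every barcode in the result."""
    analyze_result = result.get("analyzeResult", result)
    found = []
    for page in analyze_result.get("pages", []):
        # Pages without the add-on output simply have no "barcodes" key.
        for barcode in page.get("barcodes", []):
            found.append(
                (page.get("pageNumber"), barcode.get("kind"), barcode.get("value"))
            )
    return found


# Made-up stand-in for a result with one QR code on page 2.
sample_result = {
    "analyzeResult": {
        "pages": [
            {"pageNumber": 1, "lines": []},
            {
                "pageNumber": 2,
                "barcodes": [{"kind": "QRCode", "value": "https://example.com"}],
            },
        ]
    }
}

for page_number, kind, value in barcodes_in(sample_result):
    print(f"page {page_number}: {kind} -> {value}")
```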

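lol, I used to try this kind of thing by taking photos with the camera feature in Google Translate, so it's kind of amazing that Azure AI Service recognizes it this accurately.

Feeling firsthand how fast the world is advancing makes me think I need to grow a lot too.

Analyzing the drawing sample shows that the math formulas (Formulas) were recognized. lol, really... it amazes me every time.

I put the drawing-recognition code into VS Code (the sample below uses the prebuilt-read model).

Here too, the Endpoint and Key have to be checked on the page of the resource created for Document Intelligence!!

Always double-check the Key and Endpoint.. I keep creating resource groups and jumping back and forth between them, so the downside is that it gets really confusing 😅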

"""
This code sample shows Prebuilt Read operations with the Azure Form Recognizer client library. 
The async versions of the samples require Python 3.6 or later.

To learn more, please visit the documentation - Quickstart: Document Intelligence (formerly Form Recognizer) SDKs
https://learn.microsoft.com/azure/ai-services/document-intelligence/quickstarts/get-started-sdks-rest-api?pivots=programming-language-python
"""

from azure.core.credentials import AzureKeyCredential
from azure.ai.formrecognizer import DocumentAnalysisClient

"""
Remember to remove the key from your code when you're done, and never post it publicly. For production, use
secure methods to store and access your credentials. For more information, see 
https://docs.microsoft.com/en-us/azure/cognitive-services/cognitive-services-security?tabs=command-line%2Ccsharp#environment-variables-and-application-configuration
"""
endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
key = "YOUR_FORM_RECOGNIZER_KEY"

def format_bounding_box(bounding_box):
    if not bounding_box:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])

def analyze_read():
    # sample document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    
    poller = document_analysis_client.begin_analyze_document_from_url(
            "prebuilt-read", formUrl)
    result = poller.result()

    print("Document contains content: ", result.content)

    for style in result.styles:
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing Read from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            print(
                "...Line # {} has text content '{}' within bounding box '{}'".format(
                    line_idx,
                    line.content,
                    format_bounding_box(line.polygon),
                )
            )

        for word in page.words:
            print(
                "...Word '{}' has a confidence of {}".format(
                    word.content, word.confidence
                )
            )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_read()
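
Since the keys and endpoints of different resources are easy to mix up, one option is to keep them in environment variables with a separate prefix per resource instead of pasting them into the code. This is a sketch only: the variable names (DOCINTEL_ENDPOINT, DOCINTEL_KEY) and the helper are hypothetical, and the values in the demo are fake.

```python
import os


def load_credentials(prefix, environ=None):
    """Read <PREFIX>_ENDPOINT and <PREFIX>_KEY and fail loudly if one is missing."""
    environ = os.environ if environ is None else environ
    names = [f"{prefix}_ENDPOINT", f"{prefix}_KEY"]
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ[names[0]], environ[names[1]]


# Demo with a fake environment so nothing real is needed to run it.
fake_env = {
    "DOCINTEL_ENDPOINT": "https://example.cognitiveservices.azure.com/",
    "DOCINTEL_KEY": "not-a-real-key",
}
endpoint, key = load_credentials("DOCINTEL", fake_env)
print(endpoint)
```

Using one prefix per resource (for example a second one for the Translator resource) keeps each script tied to the right pair, and a missing value fails right away instead of at the first API call.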

Next I try Layout.. it seems to be (?) pretty similar to the document analysis features from before.